---
title: "Text as Behavior"
documentclass: cup-journal
bibliography: text_bib_2025.bib
output: 
  bookdown::pdf_document2:
    keep_tex: TRUE
    latex_engine: pdflatex
toc: FALSE
toc_depth: 1
indent: TRUE
md_extensions: +tex_math_single_backslash+tex_math_dollars
always_allow_html: TRUE
keywords: text analysis, text as data, political behavior, survey research
blinded: 1
nocite: |
  @graham2018revised
  @laaksonen2006survey
  @anes2016full
  @anes2020full
  @anes2024prelim
header-includes:
- \renewcommand{\and}{}
- \AtEndDocument{\gdef\lastpage@putlabel{}}
- \author{Omar Wasow}
- \affiliation{Department of Political Science, UC Berkeley}
- \email{owasow@berkeley.edu}
- \makeatletter\let\author\@gobble\makeatother
- \AtBeginDocument{\footnotetext{I thank Risa Gelles-Watnick and Christina Im for excellent research assistance. I thank Andrew Little, Elena Llaudet, Daniel Masterson, John Konicki, Zach Hertz, Alexander Agadjanian, and three anonymous reviewers for extremely helpful feedback. AI tools were used to proofread the manuscript.}}
- \keywords{Text analysis, text as data, political behavior, survey methods}
- \usepackage{afterpage}
- \usepackage{placeins}
- \usepackage{dcolumn}
- \usepackage{flafter}
- \usepackage{float}
- \usepackage{longtable}
- \usepackage{csquotes}
- \usepackage{setspace}
- \usepackage{booktabs}
- \usepackage{array}
- \usepackage{multirow}
- \usepackage{wrapfig}
- \usepackage{colortbl}
- \usepackage{pdflscape}
- \usepackage{tabu}
- \usepackage{threeparttable}
- \usepackage{threeparttablex}
- \usepackage[normalem]{ulem}
- \let\thead\relax        % avoid conflict
- \usepackage{makecell}
- \usepackage{tikz}
- \usetikzlibrary{cd}
- \usetikzlibrary{automata,positioning}
- \usetikzlibrary{shapes.multipart}
- \usetikzlibrary{decorations.pathreplacing} % for braces in diagram
- \toks0{\ifvoid\footins\else\suppressfloats[b]\fi} 
- \output\expandafter{\the\toks0\the\output}
- \usepackage[outline]{contour}
- \usepackage[skip=8pt]{caption}
- \newcommand{\fnote}[1]{\footnote{\begin{doublespace}\normalsize{#1}\vspace{-26pt}\end{doublespace}}} % 12 pt, double spaced footnotes 
- \newcolumntype{L}[1]{>{\raggedright\arraybackslash}p{#1}}
- \newcolumntype{R}[1]{>{\raggedleft\let\newline\\\arraybackslash\hspace{0pt}}p{#1}}
- \newcommand\T{\rule{0pt}{2.8ex}}
- \newcommand\B{\rule[-1.4ex]{0pt}{0pt}}
- \newcommand\Bsmall{\rule[-0.7ex]{0pt}{0pt}} 
- \newcolumntype{.}{D{.}{.}{-1}}
- \newcolumntype{d}[1] {D{.}{.}{#1}}
- \newcommand{\tbf}[1]{{\fontseries{b}\selectfont#1}}

abstract: |
  \noindent Text analysis typically focuses on content, such as sentiment or topic, but expression is also a form of effortful action. Building on this insight, I propose using simple features of open-ended tasks to study *text as behavior.* This approach treats expression, such as writing, as cognitively, emotionally, and temporally "costly" for subjects but inexpensive for researchers. I show that basic statistics, such as the number of characters, can approximate effort and significantly improve estimation of quantities of interest, including the probability of turning out to vote, the likelihood of changing party identification, and psychological states of which a subject may not be fully aware. Further, these methods can convert nonresponse into informative data; validate survey instruments; serve as mechanism checks; resist "gaming" by subjects; work across different languages; and analogize well to real-world situations. In sum, text as behavior can help address a range of issues related to quantifying attitudes and actions.

---

```{r scholr-local-functions, include=FALSE}

# =============================================================================
# Local scholr functions (embedded for replication without package dependency)
# =============================================================================

# --- Inline helpers ---
b <- function(model, var, digits = 2) {
    round(stats::coef(model)[var], digits)
}

p <- function(model, var, digits = 3) {
    s <- summary(model)
    coef_table <- s$coefficients
    p_col <- grep("Pr\\(|p-value|p.value|Pr\\(>", colnames(coef_table), value = TRUE)
    if (length(p_col) > 0) {
        pval <- coef_table[var, p_col[1]]
    } else if (ncol(coef_table) >= 4) {
        pval <- coef_table[var, 4]
    } else {
        stop("Cannot find p-value column in model summary")
    }
    if (pval < 0.001) "< .001" else paste0("= ", sub("^0", "", sprintf(paste0("%.", digits, "f"), pval)))
}

se <- function(model, var, digits = 2) {
    round(summary(model)$coefficients[var, "Std. Error"], digits)
}

or <- function(model, var, digits = 2) {
    round(exp(stats::coef(model)[var]), digits)
}

z <- function(model, var, digits = 2) {
    s <- summary(model)
    coef_table <- s$coefficients
    # Match only the test-statistic column; a bare "z|t" also matches "Estimate"
    stat_col <- grep("^(z|t) value$", colnames(coef_table), value = TRUE)
    if (length(stat_col) > 0) round(coef_table[var, stat_col[1]], digits)
    else if (ncol(coef_table) >= 3) round(coef_table[var, 3], digits)
    else NA
}

ci95 <- function(model, var, digits = 2, exp = FALSE) {
    ci <- stats::confint(model, var, level = 0.95)
    if (exp) ci <- base::exp(ci)
    paste0("[", round(ci[1], digits), ", ", round(ci[2], digits), "]")
}

bp <- function(model, var, b_digits = 2, p_digits = 3) {
    paste0("b = ", b(model, var, b_digits), ", p ", p(model, var, p_digits))
}

orp <- function(model, var, or_digits = 2, p_digits = 3) {
    paste0("OR = ", or(model, var, or_digits), ", p ", p(model, var, p_digits))
}

# --- Utility functions ---
add_comma <- function(x, ...) format(x, ..., big.mark = ",", scientific = FALSE, trim = TRUE)
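number_to_word <- function(x) {
    words <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten")
    # Spell out whole numbers 1-10 only; index just those elements so zeros,
    # negatives, and NAs cannot misalign or break the lookup
    out <- as.character(x)
    ok <- !is.na(x) & x >= 1 & x <= 10 & x == round(x)
    out[ok] <- words[x[ok]]
    out
}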
round1 <- function(x) round(as.numeric(x), 1)
round2 <- function(x) round(as.numeric(x), 2)
na_to_dash <- function(x) { x <- as.character(x); ifelse(is.na(x) | x == "NA", "-", x) }
na_to_blank <- function(x) { x <- as.character(x); ifelse(is.na(x) | x == "NA", "", x) }
`%nin%` <- function(x, table) !x %in% table
pval <- function(x) dplyr::case_when(x < 0.001 ~ "$p < 0.001$", x < 0.01 ~ "$p < 0.01$", x < 0.05 ~ "$p < 0.05$", TRUE ~ "$p > 0.05$")
format_exp <- function(model, coef = 2, digits = 1) formatC((exp(stats::coef(model)[coef]) - 1) * 100, format = "f", digits = digits)

# --- Format detection ---
get_star_format <- function() dplyr::case_when(knitr::is_latex_output() ~ "latex", knitr::is_html_output() ~ "html", TRUE ~ "text")
get_kable_format <- function() dplyr::case_when(knitr::is_latex_output() ~ "latex", knitr::is_html_output() ~ "html", TRUE ~ "markdown")
get_xtable_format <- function() dplyr::case_when(knitr::is_latex_output() ~ "latex", TRUE ~ "html")

# --- Label conversion ---
.scholr_env <- new.env()
set_label_mappings <- function(mappings, append = TRUE) {
    if (append && exists("custom_mappings", envir = .scholr_env)) {
        mappings <- c(mappings, get("custom_mappings", envir = .scholr_env))
    }
    assign("custom_mappings", mappings, envir = .scholr_env)
    invisible(mappings)
}
get_label_mappings <- function() {
    if (exists("custom_mappings", envir = .scholr_env)) get("custom_mappings", envir = .scholr_env) else NULL
}
clear_label_mappings <- function() {
    if (exists("custom_mappings", envir = .scholr_env)) rm("custom_mappings", envir = .scholr_env)
    invisible(NULL)
}

.norm_strip_wave <- function(x) {
    x <- stringr::str_replace(x, "(?:_(16|20|24))(?=_(fct|bin|int|ihs|z)\\b)", "")
    x <- stringr::str_replace(x, "(?:_(16|20|24))$", "")
    stringr::str_replace(x, "(?<=\\D)(16|20|24)$", "")
}

convert_labels <- function(model, extracted = FALSE, use_defaults = TRUE) {
    labs <- if (extracted) model else broom::tidy(model)$term
    labs_orig <- labs
    labs_norm <- .norm_strip_wave(labs)
    labs2 <- labs
    custom <- get_label_mappings()
    already_labeled <- rep(FALSE, length(labs))
    if (!is.null(custom) && length(custom) > 0) {
        for (i in seq_along(custom)) {
            pattern <- names(custom)[i]
            replacement <- custom[i]
            matches <- (stringr::str_detect(labs, pattern) | stringr::str_detect(labs_norm, pattern)) & !already_labeled
            if (any(matches)) { labs2[matches] <- replacement; already_labeled[matches] <- TRUE }
        }
    }
    if (use_defaults) labs2 <- .apply_default_mappings(labs, labs2, labs_norm)
    labs2
}

.apply_default_mappings <- function(labs, labs2, labs_norm) {
    unchanged <- labs2 == labs
    dplyr::case_when(
        !unchanged ~ labs2,
        labs == "(Intercept)" ~ "(Intercept)",
        stringr::str_detect(labs_norm, "^age$") ~ "Age",
        stringr::str_detect(labs_norm, "^female") ~ "Female",
        stringr::str_detect(labs_norm, "^male") ~ "Male",
        stringr::str_detect(labs_norm, "^educ|^education") ~ "Education",
        stringr::str_detect(labs_norm, "^income") ~ "Income",
        stringr::str_detect(labs_norm, "^married") ~ "Married",
        stringr::str_detect(labs_norm, "^employed") ~ "Employed",
        stringr::str_detect(labs_norm, "race.*[bB]lack") ~ "Race: Black",
        stringr::str_detect(labs_norm, "race.*[hH]ispanic") ~ "Race: Hispanic",
        stringr::str_detect(labs_norm, "race.*[wW]hite") ~ "Race: White",
        stringr::str_detect(labs_norm, "race.*[oO]ther") ~ "Race: Other",
        stringr::str_detect(labs_norm, "^pid7") ~ "Party ID (7-point)",
        stringr::str_detect(labs_norm, "^pid3") ~ "Party ID (3-cat)",
        stringr::str_detect(labs_norm, "^ideo7") ~ "Ideology (7-point)",
        stringr::str_detect(labs, ":") ~ labs,
        TRUE ~ labs
    )
}

# --- Stargazer helpers ---
star_cut_vector <- c(0.05, NA, NA)
star0 <- function(..., type = NULL, digits = 3, star.cutoffs = star_cut_vector) {
    if (is.null(type)) type <- get_star_format()
    stargazer::stargazer(..., digits = digits, header = FALSE, type = type, align = TRUE,
                         font.size = "scriptsize", star.cutoffs = star.cutoffs,
                         notes.append = FALSE, notes = "*$p<0.05$")
}

star_var <- function(..., omit = NULL) {
    stargazer_output <- utils::capture.output(stargazer::stargazer(..., type = "text", omit = omit))
    drop_idx <- which(stringr::str_detect(stargazer_output, "^Constant"))
    if (length(drop_idx) > 0) stargazer_output <- stargazer_output[seq_len(drop_idx[1] - 1)]
    variable_lines <- grep("^[[:alpha:]]", stargazer_output, value = TRUE)
    variable_names <- vapply(variable_lines, function(line) strsplit(line, "  +")[[1]][1], character(1))
    convert_labels(unname(variable_names), extracted = TRUE)
}

# --- TeXcount ---
tc_count <- function(file = NULL, include_bib = TRUE, include_headers = FALSE) {
    if (is.null(file)) {
        if (requireNamespace("knitr", quietly = TRUE) && !is.null(knitr::current_input())) {
            file <- sub("\\.[Rr]md$", ".tex", knitr::current_input())
        } else stop("No file specified and cannot detect current input file.")
    }
    if (!file.exists(file)) {
        warning("TeX file not found: ", file)
        return(list(text = NA, headers = NA, outside = NA, total = NA,
                    total_formatted = "[compile twice for word count]", raw = NULL))
    }
    tc_check <- Sys.which("texcount")  # portable lookup; returns "" when not on PATH
    if (!nzchar(tc_check)) {
        warning("texcount not found")
        return(list(text = NA, headers = NA, outside = NA, total = NA,
                    total_formatted = "[texcount not installed]", raw = NULL))
    }
    cmd <- paste(if (include_bib) "texcount -inc -incbib -total -sum" else "texcount -inc -total -sum", shQuote(file))
    tc_out <- system(cmd, intern = TRUE, ignore.stderr = TRUE)
    extract_count <- function(pattern) {
        line <- grep(pattern, tc_out, value = TRUE)
        if (length(line) == 0) return(NA)
        as.numeric(sub(".*?(\\d+).*", "\\1", line[1]))
    }
    text_count <- extract_count("Words in text")
    header_count <- extract_count("Words in headers")
    outside_count <- extract_count("Words outside text")
    total <- sum(c(text_count, outside_count), na.rm = TRUE)
    if (include_headers) total <- sum(c(total, header_count), na.rm = TRUE)
    list(text = text_count, headers = header_count, outside = outside_count, total = total,
         total_formatted = format(total, big.mark = ",", scientific = FALSE), raw = tc_out)
}

tc_words <- function(file = NULL, include_bib = TRUE, include_headers = FALSE) {
    tc_count(file = file, include_bib = include_bib, include_headers = include_headers)$total_formatted
}

# =============================================================================
# End of local scholr functions
# =============================================================================
```


```{r wordcount, include=FALSE}
# Word count using texcount (counts previous render's .tex file)
# Explicit path to handle working directory issues during knit
tex_file <- here::here("text_docs", "text_as_behavior9_renamed.tex")

if (file.exists(tex_file)) {
  word_count <- tc_words(file = tex_file)
} else {
  word_count <- "[compile twice for word count]"
}
```

\begin{small}
\end{small}


```{r setup_cache, include = FALSE}

cache_lgl <- FALSE
knitr::opts_chunk$set(cache.rebuild = TRUE)

```


```{r setup, include = FALSE}
library(knitr)
library(here)
opts_chunk$set(
  fig.lp  = "fig:",
  echo    = FALSE,
  message = FALSE,
  warning = FALSE,
  error   = FALSE,
  dev     = c("cairo_pdf", "png"),
  dpi     = 288

)


star_cut_vector <- c(0.05, NA, NA)

```


```{r load_packages_data_functions, include = FALSE}

# load packages
source(here::here("text_code", "text_packages_rep.R"), echo = FALSE)
source(here::here("text_code", "text_functions_rep.R"), echo = FALSE)

# load_data

# ANES

# load processed data

# 2016
load(file = here("text_data_output", "anes2016_processed.Rdata"), verbose = TRUE)

# 2020
load(file = here("text_data_output", "anes2020_merged.Rdata"), verbose = TRUE)

# 2024
load(file = here("text_data_output", "anes2024_merged.Rdata"), verbose = TRUE)

# AAP
load(file = here("text_data_output", "aap_processed.Rdata"),  verbose = TRUE)


# load other functions for running & plotting multiple models
#source(here("text_code/anes_functions.R"), echo = FALSE)
source(here("text_code", "custom_ggplot_themes.R"))
source(here("text_code", "custom_table_functions_rep.R"))

```

# Introduction {#sec:intro}


# Related Work {#sec:related-work}

# Text as Behavior {#sec:text-as-behavior}

# Data and Methods {#sec:data-methods}

\begin{table}[ht]
\begin{footnotesize}

\begin{tabular}[t]{@{}lL{1.25in}L{1.25in}L{2in}@{}}
\end{tabular}
\label{tab:overview}
\end{footnotesize}
\end{table}


## American National Election Study

## Afrobarometer {#sec:afrobarometer}

## Defining Measures {#sec:defining-measures}

\begin{footnotesize}
\begin{equation} T(s) = \frac{\operatorname{asinh}(\mathrm{nchar}(s))}{\max_{\text{mode}} \operatorname{asinh}(\mathrm{nchar}(s))} \label{eq:tfunction}
\end{equation}
\end{footnotesize}
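
To make Equation \ref{eq:tfunction} concrete, the following is a minimal base-R sketch. The function name `t_score`, the toy responses, and the `mode` labels are illustrative assumptions rather than the replication code. The sketch also assumes that the maximum in the denominator is taken within each survey mode and that a missing response counts as zero characters.

```r
# Illustrative sketch of T(s): asinh of the character count, scaled by the
# largest asinh(nchar) observed within the same survey mode.
# `t_score`, `responses`, and `mode` are hypothetical names.
t_score <- function(s, mode) {
    s[is.na(s)] <- ""                      # assumption: nonresponse = zero characters
    a <- asinh(nchar(s, type = "chars"))   # asinh is defined at zero, unlike log
    denom <- ave(a, mode, FUN = max)       # per-mode maximum, repeated per row
    ifelse(denom > 0, a / denom, 0)        # guard against a mode with no text at all
}

responses <- c("", "She is honest", "I like her stance on health care", NA, "No")
mode      <- c("web", "web", "web", "phone", "phone")

round(t_score(responses, mode), 2)
```

Scaling within mode keeps the measure on a common 0 to 1 range even when one mode tends to elicit much longer responses than another.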

# Study 1: Text as Predictor of Candidate Choice {#sec:study1}

### Study 1a: Informative Nonresponse on Candidate Choice {#sec:study1a}


```{r models_nonresponse_lddr_lrdd, include = FALSE, cache = cache_lgl}
## ---- models_nonresponse_lddr_lrdd ----
mod_nonresp16_fct <- list()

## nonresp_lddr16
# mm <- model4(
#     df = anes  %>% 
#         mutate(
#             nonresp_lddr16 = factor(nonresp_lddr16), 
#             nonresp_lrdd16 = factor(nonresp_lrdd16)
#         ), 
#     y = "vote_rep16", 
#     x = c("nonresp_lddr16", "nonresp_lrdd16"), 
#     controls = "+ racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + party + income16", 
#     type = "bin"
# )


mod_nonresp16_fct[[1]] <- glm(
    formula = vote_dem16 ~ nonresp_lddr16_fct * pid4_16 + ideo7_16  + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)


mod_nonresp16_fct[[2]] <- glm(
    formula = vote_rep16 ~ nonresp_lrdd16_fct * pid4_16 + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)

mod_nonresp16_fct[[3]] <- glm(
    formula = vote_dem16 ~ nonresp_lddr16_fct * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)


mod_nonresp16_fct[[4]] <- glm(
    formula = vote_rep16 ~ nonresp_lrdd16_fct * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)


nonresp_cont <- list()

nonresp_cont[[1]] <- glm(
    formula = vote_dem16 ~ nonresp_lddr16 * pid4_16 + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)


nonresp_cont[[2]] <- glm(
    formula = vote_rep16 ~ nonresp_lrdd16 * pid4_16 + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)

nonresp_cont[[3]] <- glm(
    formula = vote_dem16 ~ nonresp_lddr16 * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)


nonresp_cont[[4]] <- glm(
    formula = vote_rep16 ~ nonresp_lrdd16 * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = a16)

# ", "nonresp_lrdd16"),
```
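
The models above use counts of informative nonresponse as predictors. As a rough sketch of how such counts could be built, the example below tallies skipped open-ended prompts in a toy data frame. The column names, the toy responses, and the mapping of `lddr` to the "like Democrat" and "dislike Republican" prompts (and `lrdd` to the mirror image) are assumptions inferred from the variable names and axis labels. The net scale and its sign are likewise assumed, and the processed ANES files may construct these measures differently.

```r
# Hypothetical open-ended responses to four candidate prompts;
# NA or an empty string marks a nonresponse. Column names are illustrative.
toy <- data.frame(
    like_dem    = c("Experienced", NA, "Smart", ""),
    dislike_dem = c("", "Emails", "Dishonest", NA),
    like_rep    = c(NA, "Outsider", "Businessman", ""),
    dislike_rep = c("Rude", "", NA, NA),
    stringsAsFactors = FALSE
)

# A response is blank if it is missing or contains only whitespace.
is_blank <- function(x) is.na(x) | !nzchar(trimws(x))

# Skipped prompts that would have favored the Democrat
# (like Democrat, dislike Republican), and the mirror image.
toy$nonresp_lddr <- is_blank(toy$like_dem) + is_blank(toy$dislike_rep)
toy$nonresp_lrdd <- is_blank(toy$like_rep) + is_blank(toy$dislike_dem)

# Assumed net scale from -2 to 2: positive values mean more skipped
# pro-Democrat prompts than skipped pro-Republican prompts.
toy$nonresp_all <- toy$nonresp_lddr - toy$nonresp_lrdd

toy[, c("nonresp_lddr", "nonresp_lrdd", "nonresp_all")]
```

Under these assumptions, the fourth respondent skips every prompt and lands at zero on the net scale, the same value as someone who answers every prompt. That ambiguity is one reason to model the two directional counts separately as well.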


```{r tables-nonresponse-lddr-lrdd-run, include = FALSE, cache = cache_lgl}
## ---- tables-nonresponse-lddr-lrdd, results = "asis"


cov_labels <- star_var(mod_nonresp16_fct, omit = c("pid.*Oth", "race16.*"))
# cov_label1 <- convert_labels(mod_nonresp16_fct[[3]])
# cov_label2 <- convert_labels(mod_nonresp16_fct[[4]])
#len <- length(cov_labels)


# cov_labels <- c(cov_label1[1:2], cov_label2[1:2], cov_label1[3:22], cov_label2[19:22] 
#                 )

# cov_labels <- cov_labels[!str_detect(cov_labels, "Race|Party: Other|x Other")]

star_ft(mod_nonresp16_fct,
        covariate.labels = cov_labels,
        dep.var.labels = rep(c("Clinton 2016", "Trump 2016"), 2),
        no.space = TRUE,
        title = "Candidate Choice vs Nonresponse (Discrete)",
        label = "tab:tables-nonresponse-lddr-lrdd",
        omit = c("pid.*Oth", "race16.*"),
        notes = c("Race variable included in models but omitted above for space.", "*$p<0.05$"),
        add.lines = list(
            c("Feeling Thermometer?", rep("\\text{No}", 2), rep("\\text{Yes}", 2))
           )
        )


# cov_labels <- star_var(nonresp_cont)

# star_ft(nonresp_cont,
#         covariate.labels = cov_labels,
#         dep.var.labels = rep(c("Vote Clinton", "Vote Trump"), 2),
#         no.space = TRUE,
#         title = "Self-reported Vote Choice vs Nonresponse (Continuous) \\label{tab:tables-nonresponse-continuous-lddr-lrdd}",
#         omit = "pid4Other",
#         add.lines = list(
#             c("Feeling Thermometer?", rep("\\text{No}", 2), rep("\\text{Yes}", 2))
#         )
# )
```


```{r figure-1-plots-nonresponse-lddr-lrdd, fig.cap = "Marginal effects of nonresponse on the probability of selecting each candidate in 2016 (see Table\\ \\ref{tab:tables-nonresponse-lddr-lrdd}).", fig.height = 3.5, fig.width = 8, cache = cache_lgl}
## ---- figure-1-plots-nonresponse-lddr-lrdd, fig.height = 4 ----

pp <- list()

pp[[1]] <- plot_model(mod_nonresp16_fct[[3]], type = "eff", terms = c("nonresp_lddr16_fct", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), dodge = .3) + 
    aes(color = group, shape = group, group = group) +
    geom_line(linetype = "dotted", position = position_dodge(width = .3)) + 
    scale_x_continuous(breaks = c(0:2), 
                       #labels = c("0\nClinton+", "1", "2\nClinton-") 
                       labels = 0:2
                       ) +
    scale_y_continuous(
        limits = c(0,.46), 
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0,.4,.2)
        ) + 
    labs(title = element_blank(),
         #x     = "# Nonresponses\n(For Clinton + Against Trump)",
         x     = "# Informative Nonresponses\n(For Clinton + Against Trump)",
         y     = "Pr(Vote Clinton)",
         color = "Party ID",
         shape = "Party ID",
         linetype = "Party ID") + 
    theme_book() 

#scale_color_manual(values = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), name = "Party ID") +

pp[[2]] <- plot_model(mod_nonresp16_fct[[4]], type = "eff", terms = c("nonresp_lrdd16_fct", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), dodge = .3) + 
    aes(color = group, shape = group, group = group) + 
    geom_line(linetype = "dotted", position = position_dodge(width = .3)) + 
    scale_x_continuous(breaks = c(0:2), 
                       #labels = c("0\nTrump+", "1", "2\nTrump-") 
                       labels = 0:2
                       ) +
    scale_y_continuous(
        limits = c(0,.46), 
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0,.4,.2)
        ) + 
    labs(title = element_blank(),
         #x = "# Nonresponses\n(For Trump + Against Clinton)",
         x = "# Informative Nonresponses\n(For Trump + Against Clinton)",
         y = "Pr(Vote Trump)",
         color    = "Party ID",
         shape    = "Party ID",
         linetype = "Party ID") + 
    #scale_color_manual(values = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), name = "Party ID") +
    theme_book() 


# Add a little right-side space so labels don’t get clipped
# pp[[1]] <- pp[[1]] +
#   scale_x_continuous(breaks = 0:2, expand = expansion(mult = c(.02, .12)))
# pp[[2]] <- pp[[2]] +
#   scale_x_continuous(breaks = 0:2, expand = expansion(mult = c(.02, .12)))

# pp[[1]] <- add_direct_labels(pp[[1]], nudge_x = 0.25)
# pp[[2]] <- add_direct_labels(pp[[2]], nudge_x = 0.25)


#cowplot::plot_grid(plotlist = pp, ncol = 2)

# Panel A
pp[[1]] <- pp[[1]] +
  annotate("text", x = 1.5, y = 0.26, label = "Dem",
           color = "#013388", fontface = "plain", size = 3) +
  annotate("text", x = 1.5, y = 0.15, label = "Ind",
           color = "mediumpurple", fontface = "plain", size = 3) +
  annotate("text", x = 1.5, y = 0.05, label = "Rep",
           color = "#cc0000", fontface = "plain", size = 3) +
    theme(
        #text = element_text(family = "Verdana")
        axis.title.y = element_text(margin = margin(r = 10), angle = 90, vjust = 0.5)
    )


# Panel B
pp[[2]] <- pp[[2]] +
  annotate("text", x = 1.5, y = 0.03, label = "Dem",
           color = "#013388", fontface = "plain", size = 3) +
  annotate("text", x = 1.5, y = 0.11, label = "Ind",
           color = "mediumpurple", fontface = "plain", size = 3) +
  annotate("text", x = 1.5, y = 0.24, label = "Rep",
           color = "#cc0000", fontface = "plain", size = 3) +
    theme(
        #text = element_text(family = "Verdana"),
         axis.title.y = element_text(margin = margin(r = 10), angle = 90, vjust = 0.5)
    )


pp[[1]] + pp[[2]] +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A") 


```

\begin{footnotesize}
\begin{eqnarray}
\end{eqnarray}
\end{footnotesize}


```{r nonresponse-prose, include = FALSE}
ggeffects::ggpredict(mod_nonresp16_fct[[3]], terms = c("nonresp_lddr16_fct", "pid4_16 [Dem, Ind, Rep]"))
ggeffects::ggpredict(mod_nonresp16_fct[[4]], terms = c("nonresp_lrdd16_fct", "pid4_16 [Dem, Ind, Rep]"))

```
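
The `ggpredict()` calls above return predicted probabilities over a grid of the focal terms. The same idea can be sketched in base R with `predict()`. The example below uses simulated data and hypothetical variable names (`nonresp`, `pid`, `age`, `vote`), not the ANES variables, and it holds the single extra covariate at its mean. It is only similar in spirit to what `ggpredict()` does by default, not a reimplementation.

```r
set.seed(1)
n <- 600

# Simulated stand-ins for a nonresponse count, party ID, and one control.
sim <- data.frame(
    nonresp = factor(sample(0:2, n, replace = TRUE)),
    pid     = factor(sample(c("Dem", "Ind", "Rep"), n, replace = TRUE)),
    age     = round(runif(n, 18, 90))
)

# Simulated outcome: more nonresponses and non-Democratic ID lower Pr(vote).
eta <- 1 - 0.6 * as.integer(as.character(sim$nonresp)) -
    1.5 * (sim$pid == "Rep") - 0.7 * (sim$pid == "Ind")
sim$vote <- rbinom(n, 1, plogis(eta))

fit <- glm(vote ~ nonresp * pid + age, data = sim, family = binomial)

# Grid over the focal terms, holding the remaining covariate at its mean.
grid <- expand.grid(
    nonresp = levels(sim$nonresp),
    pid     = levels(sim$pid),
    age     = mean(sim$age)
)
grid$pred <- predict(fit, newdata = grid, type = "response")
grid
```

Each row of `grid` is one combination of nonresponse count and party ID, so the `pred` column corresponds to the points that a marginal effects plot would draw.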


```{r figure-A1-1-models20-nonresponse-all-run, include = FALSE, cache = cache_lgl}

# 2016 nonresponse scale as a predictor of 2020 vote choice

mod_nonresp20_fct <- list()


mod_nonresp20_fct[[1]] <- a20 %>% 
    glm(formula = vote_biden20_bin ~ nonresp_all16 * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = .)


mod_nonresp20_fct[[2]] <- a20 %>% glm(
    formula = vote_trump20_bin ~ nonresp_all16 * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = .)


pp20 <- list()

pp20[[1]] <- plot_model(mod_nonresp20_fct[[1]], type = "pred", terms = c("nonresp_all16", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), dodge = .3) + 
    aes(color = group, shape = group, group = group, linetype = group) +
    #geom_line(color = "gray50", linetype = "dotted", position = position_dodge(width = .3)) + 
   scale_x_continuous(
     breaks = seq(-2,2,1), 
     labels = c("-2\nClinton+", "-1", "0", "1", "2\nTrump+")) +
    scale_y_continuous(
        limits = c(0,1), 
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0,1,.25)) + 
    labs(title = element_blank(),
         x = "Nonresponse Scale 2016",
         y = "Pr(Vote Biden 2020)",
         color = "Party ID",
         shape = "Party ID",
         linetype = "Party ID") + 
    theme_book() 

#scale_color_manual(values = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), name = "Party ID") +

pp20[[2]] <- plot_model(mod_nonresp20_fct[[2]], type = "pred", terms = c("nonresp_all16", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), dodge = .25) + 
    aes(color = group, shape = group, group = group, linetype = group) + 
    #geom_line(color = "gray50", linetype = "dotted", position = position_dodge(width = .3)) + 
scale_x_continuous(
    breaks = seq(-2,2,1), 
    labels = c("-2\nClinton+", "-1", "0", "1", "2\nTrump+")) +
    scale_y_continuous(
        limits = c(0,1), 
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0,1,.25)) + 
    labs(title = element_blank(),
         x = "Nonresponse Scale 2016",
         y = "Pr(Vote Trump 2020)",
         color = "Party ID",
         shape = "Party ID",
         linetype = "Party ID") + 
    #scale_color_manual(values = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), name = "Party ID") +
    theme_book() 


#cowplot::plot_grid(plotlist = pp, ncol = 2)


pp20[[1]] + pp20[[2]] +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")


```


```{r figure-A1-2-models24-nonresponse-all-run, include = FALSE, cache = cache_lgl}

# 2016 nonresponse scale as a predictor of 2024 vote choice

mod_nonresp24_fct <- list()


mod_nonresp24_fct[[1]] <- a24 %>% 
    glm(formula = vote_harris24_bin ~ nonresp_all16 * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = .)


mod_nonresp24_fct[[2]] <- a24 %>% glm(
    formula = vote_trump24_bin ~ nonresp_all16 * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,
    family = binomial,
    data = .)


pp24 <- list()

pp24[[1]] <- plot_model(mod_nonresp24_fct[[1]], type = "pred", terms = c("nonresp_all16", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), dodge = .3) + 
    aes(color = group, shape = group, group = group, linetype = group) +
    #geom_line(color = "gray50", linetype = "dotted", position = position_dodge(width = .3)) + 
   scale_x_continuous(
     breaks = seq(-2,2,1), 
     labels = c("-2\nClinton+", "-1", "0", "1", "2\nTrump+")) +
    scale_y_continuous(
        limits = c(0,1), 
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0,1,.25)) + 
    labs(title = element_blank(),
         x = "Nonresponse Scale 2016",
         y = "Pr(Vote Harris 2024)",
         color = "Party ID",
         shape = "Party ID",
         linetype = "Party ID") + 
    theme_book() 

#scale_color_manual(values = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), name = "Party ID") +

pp24[[2]] <- plot_model(mod_nonresp24_fct[[2]], type = "pred", terms = c("nonresp_all16", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), dodge = .25) + 
    aes(color = group, shape = group, group = group, linetype = group) + 
    #geom_line(color = "gray50", linetype = "dotted", position = position_dodge(width = .3)) + 
scale_x_continuous(
    breaks = seq(-2,2,1), 
    labels = c("-2\nClinton+", "-1", "0", "1", "2\nTrump+")) +
    scale_y_continuous(
        limits = c(0,1), 
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0,1,.25)) + 
    labs(title = element_blank(),
         x = "Nonresponse Scale 2016",
         y = "Pr(Vote Trump 2024)",
         color = "Party ID",
         shape = "Party ID",
         linetype = "Party ID") + 
    #scale_color_manual(values = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca"), name = "Party ID") +
    theme_book() 


#cowplot::plot_grid(plotlist = pp, ncol = 2)

pp24[[1]] + pp24[[2]] +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")


```


```{r include = FALSE}
ggpredict(mod_nonresp24_fct[[1]], terms = c("nonresp_all16", "pid4_16[Dem, Ind, Rep]"))

ggpredict(mod_nonresp24_fct[[2]], terms = c("nonresp_all16", "pid4_16[Dem, Ind, Rep]"))

```

\FloatBarrier

### Study 1b: Expressive Alignment on Candidate Choice {#sec:study1b}

\begin{footnotesize}
\begin{eqnarray}
  \label{eq:alignment}
\end{eqnarray}
\end{footnotesize}


```{r alignment-correlations, include = FALSE}
# correlation btwn Expressive Alignment and feeling thermometers
cor(a16$nchar_align_ihs, a16$ft_clinton, use = "pairwise.complete.obs")
cor(a16$nchar_align_ihs, a16$ft_trump,   use = "pairwise.complete.obs")
cor(a16$nchar_align_ihs, a16$ft_trump_clinton,   use = "pairwise.complete.obs")
cor(a16$nchar_align_ihs, a16$ideo7_16, use = "pairwise.complete.obs")
cor(a16$nchar_align_ihs, a16$pid7_int,   use = "pairwise.complete.obs")

```


```{r figure-2-plots-vote-nonresp-nchar-all-ihs, fig.height = 3.5, fig.width = 8, fig.cap = "Marginal effects of Expressive Alignment in 2016 on candidate choice in 2016, by party identification (see Table\\ \\ref{tab:table-vote-nonresp-nchar-all}).", cache = cache_lgl}

m_dem_comb_int_ihs <- glm(vote_dem16 ~ nchar_align_ihs * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = a16, family = binomial)

m_dem_comb_no_int_ihs <- glm(vote_dem16 ~ nchar_align_ihs + pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = a16, family = binomial)

m_dem_comb_reduced <- glm(vote_dem16 ~ pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = a16, family = binomial)

m_rep_comb_int_ihs <- glm(vote_rep16 ~ nchar_align_ihs * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = a16, family = binomial)

m_rep_comb_no_int_ihs <- glm(vote_rep16 ~ nchar_align_ihs + pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = a16, family = binomial)

m_rep_comb_reduced <- glm(vote_rep16 ~ pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = a16, family = binomial)


p13_ihs <-
    plot_model(model = m_dem_comb_int_ihs, terms = c("nchar_align_ihs [all]", "pid4_16 [Dem, Ind, Rep]"), type = "eff",
               colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Unreg = "#cbcaca")) +
    aes(linetype = group, color = group) +
    labs(title = element_blank(), linetype = "Party ID", color = "Party ID",
         x = "Expressive Alignment 2016", y = "Pr(Vote Clinton 2016)") +
    scale_x_continuous(
        limits = c(-1, 1),
        breaks = seq(-1, 1, 1),
        labels = c("-1\nClinton+", "0", "1\nTrump+")
    ) +
    scale_y_continuous(limits = c(0, 0.85),
                       labels = scales::percent_format(accuracy = 1)) +
    theme_book()

p14_ihs <-
    plot_model(model = m_rep_comb_int_ihs, terms = c("nchar_align_ihs [all]", "pid4_16 [Dem, Ind, Rep]"), type = "eff",
               colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Unreg = "#cbcaca")) +
    aes(linetype = group, color = group) +
    labs(title = element_blank(), linetype = "Party ID", color = "Party ID",
         x = "Expressive Alignment 2016", y = "Pr(Vote Trump 2016)") +
    scale_x_continuous(
        limits = c(-1, 1),
        breaks = seq(-1, 1, 1),
        labels = c("-1\nClinton+", "0", "1\nTrump+")
    ) +
    scale_y_continuous(limits = c(0, 0.85),
                       labels = scales::percent_format(accuracy = 1)) +
    theme_book()


# Panel A
p13_ihs <- p13_ihs +
  annotate("text", x = 0.25, y = 0.42, label = "Dem",
           color = "#013388", fontface = "plain", size = 3) +
  annotate("text", x = 0.25, y = 0.17, label = "Ind",
           color = "mediumpurple", fontface = "plain", size = 3) +
  annotate("text", x = 0.25, y = 0.00, label = "Rep",
           color = "#cc0000", fontface = "plain", size = 3)


# Panel B
p14_ihs <- p14_ihs +
  annotate("text", x = 0, y = 0.04, label = "Dem",
           color = "#013388", fontface = "plain", size = 3) +
  annotate("text", x = -0.5, y = 0.135, label = "Ind",
           color = "mediumpurple", fontface = "plain", size = 3) +
  annotate("text", x = 0, y = 0.37, label = "Rep",
           color = "#cc0000", fontface = "plain", size = 3)


(p13_ihs + p14_ihs) +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A") #& theme(legend.position = "bottom")

```


```{r include = FALSE}
ggeffects::ggpredict(m_dem_comb_int_ihs, terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"))
ggeffects::ggpredict(m_rep_comb_int_ihs, terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"))
```


```{r table-vote-nonresp-nchar-all-run, results = 'hide', include = FALSE, eval = TRUE}

# cov_labels1 <- convert_labels(m_dem_comb_int)
# cov_labels2 <- convert_labels(m_rep_comb_int)


models <- list(m_dem_comb_int_ihs, m_rep_comb_int_ihs)

cov_labels <- star_var(models, omit = "pid.*Oth")

star_ft(models,
    covariate.labels = cov_labels,
    no.space         = TRUE,
    model.names      = TRUE,
    omit             = "pid.*Oth",
    dep.var.labels = c("Vote Clinton 2016", "Vote Trump 2016"),
    title = "Candidate Choice in 2016 vs Expressive Alignment 2016",
    label = "tab:table-vote-nonresp-nchar-all"#,
 #out = "../text_docs/tables/clinton-trump-2016-table.tex"

)


```


```{r anes2020-cand-pref-models, include = FALSE}
## ---- biden, results = 'asis' ----

## 2020 vote choice Biden 
biden20_out <- a20 %>% 
    filter(reg16_bin == 1) %>% # registered voters
    mutate(pid4_16 = relevel(pid4_16 %>% as.factor(), ref = "Ind")) %>%
#    mutate(ft_biden = ifelse(V201151 >= 0, V201151, NA_real_)) %>%
    glm(vote_biden20_bin ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = ., family = binomial) 

## 2020 vote choice Trump 
trump20_out <- a20 %>% 
    filter(reg16_bin == 1) %>% # registered voters
    mutate(pid4_16 = relevel(pid4_16 %>% as.factor(), ref = "Ind")) %>%
    glm(vote_trump20_bin ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = ., family = binomial) 


```


```{r anes2020-cand-pref-plots, include = FALSE}

# biden_out <- a20 %>% 
# #    filter(V201008 == 1 | V201008 == 2) %>% # registered voters
#     glm(vote_biden20_bin ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = ., family = binomial) %>% 
biden20_plot <- biden20_out %>% 
    plot_model(type = "pred", terms = c("nchar_align_ihs [all]", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca")) +
    aes(linetype = group, color = group) +
    #scale_x_continuous(limits = c(0, 2150), breaks = seq(0, 2150, 500)) +
    scale_y_continuous(
        limits = c(0, 1),
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0, 1, .25)
    ) +
    scale_x_continuous(
        breaks = c(-1, 0, 1),
        labels = c( "-1\n Clinton+", "0", "1\n Trump+") #"-2\n         Clinton+", , "2\nTrump+          "
    ) +
    labs(x = "Expressive Alignment 2016",
         y = "Pr(Vote Biden 2020)",
         title = element_blank(),
         color = "Party ID",
         linetype = "Party ID"
         #title = "Turnout vs Text"#,
         #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
    ) +
    theme_book()


# trump_out <- a20 %>% 
#     filter(reg16_bin == 1) %>% # registered voters
#     mutate(pid4_16 = relevel(pid4_16 %>% as.factor(), ref = "Ind")) %>%
#     glm(vote_trump20_bin ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = ., family = binomial) %>% 
trump20_plot <- trump20_out %>% 
    plot_model(type = "pred", terms = c("nchar_align_ihs [all]", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca")) +
    aes(linetype = group, color = group) +
    #scale_x_continuous(limits = c(0, 2150), breaks = seq(0, 2150, 500)) +
    scale_y_continuous(
        limits = c(0, 1),
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0, 1, .25)
    ) +
    scale_x_continuous(
        breaks = c(-1, 0, 1),
        labels = c( "-1\n Clinton+", "0", "1\n Trump+") #"-2\n         Clinton+", , "2\nTrump+          "
    ) +
    labs(x        = "Expressive Alignment 2016",
         y        = "Pr(Vote Trump 2020)",
         title    = element_blank(),
         color    = "Party ID",
         linetype = "Party ID"
         #title = "Turnout vs Text"#,
         #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
    ) +
    theme_book()

```


```{r biden-trump-2020-table-run, results = 'hide', include = FALSE}

## ---- biden-trump-2020-table, results = 'asis' ----

models <- list(biden20_out, trump20_out)

cov_labels <- star_var(models, omit = "pid.*Oth")

star_ft(
    models,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    dep.var.labels   = c("Vote Biden 2020", "Vote Trump 2020"),
    no.space         = TRUE,
    title = "Candidate Choice in 2020 vs Expressive Alignment 2016",
    label = "tab:biden-trump-2020-table"#,
    #out = "../text_docs/tables/biden-trump-2020-table.tex"

)

```


```{r figure-A1-4-biden-trump-vote-comb-plot-run, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on candidate preference in 2020, by party identification (see Table\ \\ref{tab:biden-trump-2020-table}).", include = FALSE, cache = cache_lgl}


## ---- figure-A1-4-biden-trump-vote-comb-plot, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png") ----

biden20_plot + trump20_plot +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")

```


```{r include = FALSE}
ggeffects::ggpredict(biden20_out, terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"))
ggeffects::ggpredict(trump20_out, terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"))

```


```{r anes2024-cand-pref-models, include = FALSE}
## ---- harris, results = 'asis' ----

## 2024 vote choice Harris 
harris24_out <- a24 %>% 
    filter(reg16_bin == 1) %>% # registered voters
    mutate(pid4_16 = relevel(pid4_16 %>% as.factor(), ref = "Ind")) %>%
#    mutate(ft_biden = ifelse(V201151 >= 0, V201151, NA_real_)) %>%
    glm(vote_harris24_bin ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = ., family = binomial) 

## 2024 vote choice Trump 
trump24_out <- a24 %>% 
    filter(reg16_bin == 1) %>% # registered voters
    mutate(pid4_16 = relevel(pid4_16 %>% as.factor(), ref = "Ind")) %>%
    glm(vote_trump24_bin ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + ideo7_16 + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, data = ., family = binomial) 


```


```{r anes2024-cand-pref-plots, include = FALSE}

harris24_plot <- harris24_out %>% 
    plot_model(type = "pred", terms = c("nchar_align_ihs [all]", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca")) +
    aes(linetype = group, color = group) +
    #scale_x_continuous(limits = c(0, 2150), breaks = seq(0, 2150, 500)) +
    scale_y_continuous(
        limits = c(0, 1),
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0, 1, .25)
    ) +
    scale_x_continuous(
        breaks = c(-1, 0, 1),
        labels = c( "-1\n Clinton+", "0", "1\n Trump+") #"-2\n         Clinton+", , "2\nTrump+          "
    ) +
    labs(x = "Expressive Alignment 2016",
         y = "Pr(Vote Harris 2024)",
         title = element_blank(),
         color = "Party ID",
         linetype = "Party ID"
         #title = "Turnout vs Text"#,
         #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
    ) +
    theme_book()


trump24_plot <- trump24_out %>% 
    plot_model(type = "pred", terms = c("nchar_align_ihs [all]", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca")) +
    aes(linetype = group, color = group) +
    #scale_x_continuous(limits = c(0, 2150), breaks = seq(0, 2150, 500)) +
    scale_y_continuous(
        limits = c(0, 1),
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(0, 1, .25)
    ) +
    scale_x_continuous(
        breaks = c(-1, 0, 1),
        labels = c( "-1\n Clinton+", "0", "1\n Trump+") #"-2\n         Clinton+", , "2\nTrump+          "
    ) +
    labs(x        = "Expressive Alignment 2016",
         y        = "Pr(Vote Trump 2024)",
         title    = element_blank(),
         color    = "Party ID",
         linetype = "Party ID"
         #title = "Turnout vs Text"#,
         #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
    ) +
    theme_book()

```


```{r harris-trump-2024-table-run, results = 'hide', include = FALSE}

## ---- harris-trump-2024-table, results = 'asis' ----

models <- list(harris24_out, trump24_out)

cov_labels <- star_var(models, omit = "pid.*Oth")

star_ft(
    models,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    dep.var.labels   = c("Vote Harris 2024", "Vote Trump 2024"),
    no.space         = TRUE,
    title = "Candidate Choice in 2024 vs Expressive Alignment 2016",
    label = "tab:harris-trump-2024-table"#,
    #out = "../text_docs/tables/harris-trump-2024-table.tex"

)

```


```{r figure-A1-5-harris-trump-vote-comb-plot-run, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on candidate preference in 2024, by party identification (see Table\ \\ref{tab:harris-trump-2024-table}).", include = FALSE, cache = cache_lgl}


## ---- figure-A1-5-harris-trump-vote-comb-plot, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png") ----

harris24_plot + trump24_plot +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")

```


```{r include = FALSE}
ggeffects::ggpredict(harris24_out, terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"))
ggeffects::ggpredict(trump24_out,  terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"))

```

\FloatBarrier

### Study 2: Expressive Alignment on Party Identification {#sec:study2}


```{r figure-3-switcher2-multinomial-plot, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on likelihood of maintaining the same party identification between 2016 and 2020 using a multinomial model (see Table\ \\ref{tab:switcher2-multinomial-table}).", cache = FALSE}
## ---- figure-3-switcher2-multinomial-plot ----

mout_all <- a20 %>%  
    filter(reg16_bin == 1) %>% # registered voters
    #filter(switcher_fct == "Dem16->Dem20" | switcher_fct == "Ind16->Ind20" | switcher_fct == "Rep16->Rep20") %>% 
    multinom(switcher2_20_fct ~ nchar_align_ihs +ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16, data = ., trace = FALSE, Hess = TRUE) 


preds <- predictions(
    mout_all,
    newdata   = datagrid(nchar_align_ihs = seq(-1, 1, length.out = 10)),
    variables = "nchar_align_ihs"#,
    #vcov = "robust"
)

# look at predicted estimates
# preds %>% select(group, nchar_align_ihs, estimate) %>% filter(!str_detect(group, "!"))

# REP
# 0.2614/0.0442
# DEM
# 0.3243/0.0770

custom_colors <- c(
    "Dem16 → Dem20" = "#013388",
    "Rep16 → Rep20" = "#cc0000",
    "Ind16 → Ind20" = "mediumpurple"
)

custom_linetypes <- c(
    "Dem16 → Dem20" = "solid",
    "Rep16 → Rep20" = "solid",
    "Ind16 → Ind20" = "solid"
)

# facet_labels <- c(
#     "Dem16 &rarr; Dem20"     = "Dem16 → Dem20<br><span style='font-size:9pt'>(Stay ≈ 29.8%, Switch ≈ 6.9%)</span>",
#     "Ind16 &rarr; Ind20" = "Ind16 → Ind20<br><span style='font-size:9pt'>(Stay ≈ 19.8%, Switch ≈ 5.9%)</span>",
#     "Rep16 &rarr; Rep20" = "Rep16 → Rep20<br><span style='font-size:9pt'>(Stay ≈ 24.5%, Switch ≈ 3.9%)</span>"
# )

facet_labels <- c(
    "Dem16 &rarr; Dem20"     = "Dem16 → Dem20<br><span style='font-size:9pt'>(Stay ≈ 30%, Switch ≈ 7%)</span>",
    "Ind16 &rarr; Ind20" = "Ind16 → Ind20<br><span style='font-size:9pt'>(Stay ≈ 20%, Switch ≈ 6%)</span>",
    "Rep16 &rarr; Rep20" = "Rep16 → Rep20<br><span style='font-size:9pt'>(Stay ≈ 24%, Switch ≈ 4%)</span>"
)


preds %>%
    filter(str_detect(group, "Dem16->Dem20|Ind16->Ind20|Rep16->Rep20")) %>%
    mutate(group = str_replace(group, "->", " → ")) %>%
    #filter(!str_detect(group, "!")) %>% 
    #filter(!str_detect(group, "Ind")) %>%
    ggplot() +
    aes(linetype = group, fill = group, color = group) +
    # geom_smooth(aes(x     = nonresp_nchar_all_log, y = estimate), method = "loess", se = FALSE) +
    geom_ribbon(
        aes(
            x     = nchar_align_ihs,
            ymin  = conf.low,
            ymax  = conf.high,
            group = group,
            color = group
        ),
         alpha = 0.2, color = NA
    ) +
    geom_smooth(
        aes(x = nchar_align_ihs, y = estimate, group = group, color = group),
        method    = "loess", 
        se        = FALSE, 
        linewidth = 0.75
    ) +
    
    # geom_line(
    #     aes(
    #         x     = nchar_align_ihs,
    #         y     = estimate,
    #         color = group,
    #         group = group
    #     ),
    #     linewidth = 0.5
    # ) +
    scale_color_manual(values = custom_colors) +
    scale_fill_manual(values  = custom_colors) +  # So ribbons match line color
    #scale_linetype_manual(values = custom_linetypes) +
    facet_wrap(~group) +
    #facet_wrap(~ group, labeller = labeller(group = facet_labels)) +
    scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
    scale_x_continuous(
        breaks = c(-1, 0, 1),
        labels = c("-1\n         Clinton+", "0", "1\nTrump+        ")
    ) +
    labs(
        x = "Expressive Alignment 2016",
        y = element_blank() #"Predicted Probability"
    ) +
    #theme_minimal(base_size = 14) +
    theme_book() +
    theme(
        legend.position = "none",
        #strip.text.x = element_markdown(size = 10),
        #axis.text = element_text(size = 10),
        text = element_text(size = 13, 
                            family = "Verdana")
    )


```


```{r switcher2-multinomial-table-run, results = 'hide', include = FALSE, cache = cache_lgl}


models <- list(mout_all) 

cov_labels <- star_var(models, omit = "pid.*Oth")

# Capture the output as a character vector
stargazer_output <- capture.output(
    star_ft(mout_all,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    no.space         = TRUE,
    #model.names      = TRUE,
    #dep.var.labels   = c("P"),
    title = "Predicted probability of Party ID stability or switching between 2016 and 2020 vs Expressive Alignment in 2016 using a multinomial model with Ind16 $\\rightarrow$ Ind20 as the reference category.",
    label = "tab:switcher2-multinomial-table" #,
    #out = "multinomial-switcher-table.tex"
)
)

# Replace -> with $\rightarrow$
stargazer_latex_fixed <- stargazer_output %>%
    str_replace_all("-\\\\textgreater ", " $\\\\rightarrow$ ") %>%
    str_replace_all("Dem", "D") %>%
    #str_replace_all("Ind", "I") %>%
    str_replace_all("Rep", "R")

# Print to console or write to file
cat(stargazer_latex_fixed, sep = "\n")
```


```{r figure-A1-6-model-party-switching-16-to-24-run, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on likelihood of maintaining the same party identification between 2016 and 2024 using a multinomial model (see Table\ \\ref{tab:table-party-switching-16-to-24}).", include = FALSE, cache = cache_lgl}
## ---- figure-A1-6-model-party-switching-16-to-24 ----

mout24 <- a24 %>%  
    filter(reg16_bin == 1) %>% # registered voters
    #filter(switcher_fct == "Dem16->Dem20" | switcher_fct == "Ind16->Ind20" | switcher_fct == "Rep16->Rep20") %>% 
    multinom(switcher24_fct ~ nchar_align_ihs +ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16, data = ., trace = FALSE, Hess = TRUE) 


## ---- plot-party-switching-16-to-24 ----

preds <- predictions(
    mout24,
    newdata   = datagrid(nchar_align_ihs = seq(-1, 1, length.out = 10)),
    variables = "nchar_align_ihs"
)


custom_colors <- c(
    "Dem16 → Dem24" = "#013388",
    "Rep16 → Rep24" = "#cc0000",
    "Ind16 → Ind24" = "mediumpurple"
)

custom_linetypes <- c(
    "Dem16 → Dem24" = "solid",
    "Rep16 → Rep24" = "solid",
    "Ind16 → Ind24" = "solid"
)

plot_party_switch_16_24 <- preds %>%
    filter(str_detect(group, "Dem16->Dem24|Ind16->Ind24|Rep16->Rep24")) %>%
    mutate(group = str_replace(group, "->", " → ")) %>%
    #filter(!str_detect(group, "!")) %>%
    #filter(!str_detect(group, "Ind")) %>%
    ggplot() +
    aes(linetype = group, fill = group, color = group) +
    # geom_smooth(aes(x     = nonresp_nchar_all_log, y = estimate), method = "loess", se = FALSE) +
    geom_ribbon(
        aes(
            x     = nchar_align_ihs,
            ymin  = conf.low,
            ymax  = conf.high,
            group = group,
            color = group
        ),
        alpha = 0.2, color = NA
    ) +
    geom_smooth(
        aes(x = nchar_align_ihs, y = estimate, group = group, color = group),
        method    = "loess",
        se        = FALSE,
        linewidth = 0.75
    ) +
    # geom_line(
    #     aes(
    #         x     = nchar_align_ihs,
    #         y     = estimate,
    #         color = group,
    #         group = group
    #     ),
    #     linewidth = 0.5
    # ) +
    scale_color_manual(values = custom_colors) +
    scale_fill_manual(values  = custom_colors) +  # So ribbons match line color
    #scale_linetype_manual(values = custom_linetypes) +
    facet_wrap(~group) +
    scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
    scale_x_continuous(
        breaks = c(-1, 0, 1),
        labels = c("-1\n              Clinton+", "0", "1\nTrump+             ")
    ) +
    labs(
        x = "Expressive Alignment 2016",
        y = element_blank() #"Predicted Probability"
    ) +
    #theme_minimal(base_size = 14) +
    theme_book() +
    theme(
        legend.position = "none",
        #strip.text.x = element_markdown(),
        #axis.text = element_text(size = 12),
        text = element_text(size = 12)
    )
```


```{r table-party-switching-16-to-24-run, results = 'hide', include = FALSE}
## ---- table-party-switching-16-to-24 ----


cov_labels <- star_var(mout24, omit = "pid.*Oth")

# Capture the output as a character vector
stargazer_output <- capture.output(
    star_ft(mout24,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    no.space         = TRUE,
    #model.names      = TRUE,
    #dep.var.labels   = c("P"),
    title = "Predicted probability of Party ID stability or switching between 2016 and 2024 vs Expressive Alignment in 2016 using a multinomial model with Ind16 $\\rightarrow$ Ind24 as the reference category.",
    label = "tab:table-party-switching-16-to-24" #,
    #out = "multinomial-switcher-table.tex"
)
)

# Replace -> with $\rightarrow$
stargazer_latex_fixed <- stargazer_output %>%
    str_replace_all("-\\\\textgreater ", " $\\\\rightarrow$ ") %>%
    str_replace_all("Dem", "D") %>%
    #str_replace_all("Ind", "I") %>%
    str_replace_all("Rep", "R")

# Print to console or write to file
cat(stargazer_latex_fixed, sep = "\n")
```


```{r mout24-predict-prose, include = FALSE}
ggeffects::ggpredict(mout_all, terms = c("nchar_align_ihs"))
ggeffects::ggpredict(mout24, terms = c("nchar_align_ihs"))

```

\FloatBarrier

### Study 3: Text as Validated Turnout {#sec:study3}



```{r validated-vote-models, cache = cache_lgl}
# mod_valid <- list()
mod_valid_ihs <- list()


# anes <- a16 %>%
#   mutate(
#     #nchar_nr_prob_tot16 = nchar_nr_prob_tot16
#     pol_attn16 = 6-(V161003),
# 
#   )

# mod_valid[[1]] <- glm(vote_validated16 ~ nchar_nr_prob_tot16 , 
#           family = binomial, data = a16 %>% filter(reg16_bin == 1) )

# mod_valid[[2]] <- glm(vote_validated16 ~ mode16 + female16 + educ16 + age16 + race4_16 + pid4_16 + income16 + pol_attn16,
#           family = binomial, data = a16 )


# mod_valid[[2]] <- glm(vote_validated16 ~ nchar_nr_prob_tot16 + pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, 
#           family = binomial, data = a16  %>% filter(reg16_bin == 1) )

# mod_valid[[3]] <- glm(vote_validated16 ~ nchar_nr_prob_tot16 * pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, 
#           family = binomial, data = a16  %>% filter(reg16_bin == 1) )

# mod_valid_reduced <- glm(vote_validated16 ~ pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, 
#           family = binomial, data = a16  %>% filter(reg16_bin == 1) )


mod_valid_ihs[[1]] <- glm(vote2016_prob ~ nchar_problems_ihs, 
          family = quasibinomial, data = a16  %>% filter(reg16_bin == 1) )

mod_valid_ihs[[2]] <- glm(vote2016_prob ~ nchar_problems_ihs + likely_vote16,  
          family = quasibinomial, data = a16  %>% filter(reg16_bin == 1) )

mod_valid_ihs[[3]] <- glm(vote2016_prob ~ nchar_problems_ihs + pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, 
          family = quasibinomial, data = a16  %>% filter(reg16_bin == 1) )


mod_valid_ihs[[4]] <- glm(vote2016_prob ~ nchar_problems_ihs * pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, 
          family = quasibinomial, data = a16 %>% filter(reg16_bin == 1) )

#mod_valid[[4]] <- glm(vote_validated16 ~ nchar_nr_prob_tot16 * pid4_16 + mode16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16, 
 #         family = binomial, data = a16 )


# mod_valid[[4]] <- glm(vote_validated16 ~ nchar_nr_prob_tot16 * pid4_16 + mode16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + racial_resent16 + sexism16 + authorit16, 
#           family = binomial, data = a16 )

# mod_valid[[5]] <- glm(vote_validated16 ~ nchar_nr_prob_tot16 * pid4_16 + mode16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + racial_resent16 + sexism16 + authorit16, 
#           family = binomial, data = a16 )

```


```{r turnout-vs-mip-plot-log, fig.height = 3.75, fig.width = 8, fig.cap = "Marginal effects of Expressive Engagement in 2016 on validated turnout in 2016, without interaction with Party ID (A) and with interaction (B). Logistic regression model controls for female, education, age, race, party identification, income, mode (web or face-to-face) and political interest among registered voters (see Table\ \\ref{tab:validated-vote-table}).", include = FALSE, cache = cache_lgl}
# No interaction with party identification is shown as there is little differentiation by partisanship.
# Total writing scale is the sum of four questions calculated as follows: Writing Scale = 100 + (sum of characters) OR (nonresponse = -100). Model includes a standard set of controls.
# Model controls include educ16, age16, female16, race16, party ID, attention to politics, income16, racial resentment, hostile sexism16, and authoritarianism.

p_valid_ihs <- plot_model(mod_valid_ihs[[3]], type = "pred", terms = c("nchar_problems_ihs [all]"), colors = "gs") +
               #c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca")) +
    #aes(linetype = group, color = group) +
  #scale_x_continuous(limits = c(0, 2150), breaks = seq(0, 2150, 500)) +
  scale_y_continuous(
    limits = c(.54, .9),
    labels = scales::percent_format(accuracy = 1),
    breaks = seq(.6, .9, .1)
  ) +
  labs(x = "Expressive Engagement 2016",
       y = "Pr(Validated Turnout 2016)",
       title = element_blank(),
       #color = "Party ID",
       #linetype = "Party ID"
       #title = "Turnout vs Text"#,
       #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
       ) +
       theme_book() +
         theme(text = element_text(size = 12),
               axis.title.y = element_text(margin = margin(r = 14), angle = 90, vjust = 0.5)
)


# Panel B uses the interaction model (mod_valid_ihs[[4]]) so predictions can vary by party ID
p_valid_ihs_int <- plot_model(mod_valid_ihs[[4]], type = "pred", terms = c("nchar_problems_ihs [all]", "pid4_16 [Dem,Ind,Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca")) +
    aes(linetype = group, color = group) +
  #scale_x_continuous(limits = c(0, 2150), breaks = seq(0, 2150, 500)) +
  scale_y_continuous(
    limits = c(.5, .9),
    labels = scales::percent_format(accuracy = 1),
    breaks = seq(.5, .9, .1)
  ) +
  labs(x = "Expressive Engagement 2016", 
       y = element_blank(), #"Pr(Validated Turnout 2016)",
       title = element_blank(),
       color = "Party ID",
       linetype = "Party ID"
       #title = "Turnout vs Text"#,
       #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
       ) +
       theme_book() +
         theme(text = element_text(size = 12))


# p_valid_ihs + p_valid_ihs_int + 
#   plot_layout(guides = "collect") +
#   plot_annotation(tag_levels = "A")

```


```{r include = FALSE}
# generate predicted probabilities to use in prose
ggeffects::ggpredict(mod_valid_ihs[[3]], terms = c("nchar_problems_ihs [0,1,2,3,3.65]"))

# ggeffects::ggpredict(mod_valid_ihs[[3]], terms = c("nchar_problems_ihs", "pid4_16"))

```


```{r turnout2020-models, include = FALSE}
## ---- turnout-models, results = 'asis' ----


# 2016 predictors on 2020 outcome
tout_all <- a20 %>%  
    filter(reg16_bin == 1) %>% # registered voters
    glm(vote_valid20_weighted ~ nchar_problems_ihs + pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, family = quasibinomial, data = .) 


tout_all_int <- a20 %>%  
    filter(reg16_bin == 1) %>% # registered voters
    glm(vote_valid20_weighted ~ nchar_problems_ihs * pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, family = quasibinomial, data = .) 

tout_reduced <- tout_all$model %>%  
    #filter(reg16_bin == 1) %>% # registered voters
    glm(vote_valid20_weighted ~ pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16 + likely_vote16, family = quasibinomial, data = .) 

```


```{r turnout2020-models-table, results = 'asis', include = FALSE}

models <- list(tout_all, tout_all_int) # tout_reduced, 

cov_labels <- star_var(models, omit = "pid.*Oth")

star_ft(models,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    no.space         = TRUE,
    model.names      = TRUE,
    dep.var.labels   = c("Validated Turnout 2020"),
    title = "Validated turnout incorporating match probability in 2020 vs Expressive Engagement Scale from 2016.",
    label = "tab:turnout2020-models-table"
)

# star_ft()#, tout_all_int)


a_out  <- anova(tout_reduced, tout_all, test = "Chisq")

# Print the results using custom print_models function
print_models(a_out)

print_anova(a_out, "Table of Likelihood Ratio Test for Validated Turnout in 2020 vs Models with and without Expressive Engagement Scale in 2016")

```


```{r turnout-models-plots, include = FALSE}


## ---- turnout-models-plots ----

tout_all_plot1 <- tout_all %>% 
    plot_model(type = "pred", terms = c("nchar_problems_ihs [all]"), colors = "gs") +
    scale_y_continuous(
        limits = c(.54, .9),
        labels = scales::percent_format(accuracy = 1),
        breaks = seq(.6, .9, .1)
    ) +
    labs(x = "Expressive Engagement 2016",
         y = "Pr(Validated Turnout 2020)",
         title = element_blank(),
         #color = "Party ID",
         #linetype = "Party ID"
         #title = "Turnout vs Text"#,
         #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
    ) +
    theme_book() +
         theme(text = element_text(size = 12),
               axis.title.y = element_text(margin = margin(r = 14), angle = 90, vjust = 0.5)
)


tout_all_plot2 <- tout_all_int %>% 
    plot_model(type = "pred", terms = c("nchar_problems_ihs [all]", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Other = "#cbcaca")) +
    aes(linetype = group, color = group) + 
    scale_y_continuous(
                limits = c(.5, .9),
        labels = scales::percent_format(accuracy = 1),
                breaks = seq(.5, .9, .1)
    ) +
    labs(x = "Expressive Engagement 2016",
#         y = "Pr(Validated Turnout in 2020)",
         y = element_blank(),
         title = element_blank(),
         color = "Party ID",
         linetype = "Party ID"
         #title = "Turnout vs Text"#,
         #caption = "Writing Scale = 100 + (sum of characters) OR (nonresponse = -100)"
    ) +
    theme_book() +
       theme(text = element_text(size = 13))

# tout_all_plot1 + tout_all_plot2 + 
#     plot_layout(guides = "collect") +
#     plot_annotation(tag_levels = "A")


```


```{r figure-4-turnout-models-plots-patch, fig.height = 3.75, fig.width = 8, fig.cap = "Marginal effects of Expressive Engagement in 2016 on validated turnout incorporating match probability in (A) 2016 and (B) 2020 using quasibinomial models (see Table\ \\ref{tab:turnout-models-table}).", include = TRUE, cache = cache_lgl}


## ---- figure-4-turnout-models-plots-patch, fig.height = 3, fig.width = 7 ----


p_valid_ihs + tout_all_plot1 + 
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A") #+
    # theme(
    #         text = element_text(family = "Verdana")
    #     )


```


```{r include = FALSE}
## ---- predict_turnout20 ----

max(a20$nchar_problems_ihs)

ggeffects::ggpredict(tout_all, terms = c("nchar_problems_ihs"))


```


```{r turnout-models-table-run, results = 'hide', include = FALSE, cache = cache_lgl}


models <- list(mod_valid_ihs[[3]], tout_all)

cov_labels <- star_var(models, omit = "pid.*Oth")

star_ft(models,
    covariate.labels = cov_labels,
    no.space         = TRUE,
    model.names      = TRUE,
    omit             = "pid.*Oth",
    dep.var.labels = c("Turnout 2016", "Turnout 2020"),
    title = "Validated Turnout incorporating match probability in 2016 and 2020 vs Expressive Engagement in 2016",
    label = "tab:turnout-models-table"
)


```

\FloatBarrier


```{r combined-mip-non-models, include = FALSE}
# + I(age16^2)
mod_valid_ihs_comb <- list()

mod_valid_ihs_comb[[1]] <- glm(
    vote_validated16 ~ nchar_problems_ihs + pid4_16 * nonresp_lddr16_fct +
        female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 +
        mode16 + likely_vote16,
    family = binomial,
    data   = a16 %>% filter(reg16_bin == 1)
)

mod_valid_ihs_comb[[2]] <- glm(
    vote_validated16 ~ nchar_problems_ihs + pid4_16 * nonresp_lrdd16_fct +
        female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 +
        mode16 + likely_vote16,
    family = binomial,
    data   = a16 %>% filter(reg16_bin == 1)
)


# mod_valid_ihs_comb[[3]] <- glm(vote_validated16 ~ nchar_problems_ihs * pid4_16 * nonresp_lddr16_fct + mode16 + female16 + educ16 + age16  + race4_16 + income16 + pol_attn16, family = binomial, data = a16  %>% filter(reg16_bin == 1) )
# 
# mod_valid_ihs_comb[[4]] <- glm(vote_validated16 ~ nchar_problems_ihs * pid4_16 * nonresp_lrdd16_fct + mode16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16, family = binomial, data = a16  %>% filter(reg16_bin == 1) )
```

\FloatBarrier

### Study 4: Text as Behavioral Outcome {#sec:study4}


```{r load_processed_afro_data, include = FALSE, cache = cache_lgl}

## ----  load_processed_afro_data ----
load(here("text_data_output/afrobarometer_processed.Rdata"))

# view distribution of languages
afro$lang %>% table()

```


```{r regression1_afro, cache = cache_lgl}


## ----  regression1_afro ----
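# negative binomial models of response length, from bivariate to full interaction;
# language enters as an explanatory factor from mod_dem_red onward
mod_dem_bi <- afro %>% 
    filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>%
    glm.nb(nchar_dem_total ~ dem_importance, # bivariate: no language, controls, or race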
           data = .
    )

mod_dem_red <- afro %>% 
    filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>% 
    glm.nb(nchar_dem_total ~ dem_importance + lang, # without race
           data = .
    )

mod_dem <- afro %>% 
    filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>% 
    glm.nb(nchar_dem_total ~ dem_importance + lang + gender + educ + age + income_proxy, # without race
  data = .
)

mod_dem_race <- afro %>% 
    filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>% 
    glm.nb(nchar_dem_total ~ dem_importance + lang + gender + educ + age + income_proxy + race5, # with race
           data = .
    )

mod_dem_int <- afro %>% 
    filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>% 
  glm.nb(nchar_dem_total ~ dem_importance * lang + gender + educ + age + income_proxy + race5, # interaction
  data = .
)

```


```{r figure-5-plot1-afro, fig.height = 3.5, fig.width = 8, fig.cap = "Marginal effects of the Importance of Democracy Scale on the number of characters written in response to open-ended prompts asking what 'democracy' means to the respondent, by language. The negative binomial model includes controls for gender, education, age, race, and an income proxy (see Table \\ref{tab:sum2-afro-table}).", cache = FALSE}


## ----  figure-5-plot1-afro, fig.cap = "nchar vs importance of democracy by language", fig.align = "center", fig.pos = "h", results = 'asis', fig.height = 5 ----
# plot of interaction model
plot_model(mod_dem_int,
           show.legend = FALSE,
           type   = "pred",
           terms  = c("dem_importance", "lang"),
           colors = c(English = "#013388", French = "#cc0000", Portuguese = "mediumpurple")) + 
    aes(color = group, linetype = group) +
    labs(y = "Predicted # of Characters:\n'What democracy means to you?'",
         x = "Importance of Democracy Scale") +
    scale_x_continuous(#limits = c(0, 50), #breaks = seq( 10, 20, 30, 40),
                       labels = c("   Low", "", "", "", "", "High   ")) +  
  labs(title = element_blank()) +
  facet_wrap(~group) +
  theme_book() +
    theme(text = element_text(size = 13))


```

\FloatBarrier

# Discussion {#sec:discussion}

# Conclusion {#sec:conclusion}

\newpage

# References {-}


\newpage


\appendix

# Appendix {#sec:appendix}

## Study 1: Candidate Choice and Party Identification vs Nonresponse and Expressive Alignment {#sec:appendix-study1}

### Study 1a: Candidate Choice vs Nonresponse with Candidate Evaluation Prompts


```{r dedupe, include = FALSE}
# Blank out repeated values within each run of identical items, keeping only
# the first `keep` items of the run. This cleans up tables with repeating
# labels and works around a kable bug with two-column row grouping.
dedupe_runs <- function(x, keep = 1, replacement = "") {
  r <- rle(x)
  run_id <- rep(seq_along(r$lengths), r$lengths)
  pos_in_run <- ave(seq_along(x), run_id, FUN = seq_along)
  ifelse(pos_in_run <= keep, x, replacement)
}
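
# Sanity check (illustrative, not part of the original analysis): a minimal
# sketch of what dedupe_runs() returns on a toy vector. Only the first item
# of each run of identical values is kept; later repeats become "". The final
# "a" survives because it starts a new run. `toy_labels` is a hypothetical
# name used only for this example.
toy_labels <- c("a", "a", "b", "b", "b", "a")
stopifnot(identical(
    dedupe_runs(toy_labels, keep = 1),
    c("a", "", "b", "", "", "a")
))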
```


```{r nonresp-table-small-build, results = 'asis', include = FALSE, eval = TRUE}
lddr_tab <- a16 %>% 
    tabyl(nonresp_lddr16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame()

lddr_tab_by_party <- a16 %>% 
    tabyl(nonresp_lddr16_fct, pid4_16) %>% 
    adorn_percentages() %>% 
    adorn_pct_formatting(digits = 0) %>% 
    as.data.frame()

lddr_tab_tot_and_by_party <- cbind(lddr_tab, lddr_tab_by_party[ ,-1])

names(lddr_tab_tot_and_by_party)[1] <- "nonresp"

lrdd_tab <- a16 %>% 
    tabyl(nonresp_lrdd16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>% 
    as.data.frame()

lrdd_tab_by_party <- a16 %>% 
    tabyl(nonresp_lrdd16_fct, pid4_16) %>% 
    adorn_percentages() %>% 
    adorn_pct_formatting(digits = 0) %>% 
    as.data.frame()

lrdd_tab_tot_and_by_party <- cbind(lrdd_tab, lrdd_tab_by_party[ ,-1])

names(lrdd_tab_tot_and_by_party)[1] <- "nonresp"

# conditional on nonresponding, what is the distribution by party id
tab_tot_and_by_party <- rbind(lddr_tab_tot_and_by_party, lrdd_tab_tot_and_by_party)


lddr_tab_dem <- a16 %>% 
    filter(pid4_16 == "Dem") %>% 
    tabyl(nonresp_lddr16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Democrat", 
           `Candidate Evaluation` = "For Clinton + Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lddr16_fct)


lddr_tab_rep <- a16 %>% 
    filter(pid4_16 == "Rep") %>% 
    tabyl(nonresp_lddr16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Republican", 
           `Candidate Evaluation` = "For Clinton + Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lddr16_fct)


lddr_tab_ind <- a16 %>% 
    filter(pid4_16 == "Ind") %>% 
    tabyl(nonresp_lddr16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Independent", 
           `Candidate Evaluation` = "For Clinton + Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lddr16_fct)


lrdd_tab_dem <- a16 %>% 
    filter(pid4_16 == "Dem") %>% 
    tabyl(nonresp_lrdd16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>% 
    as.data.frame() %>% 
    mutate(`Party ID` = "Democrat", 
           `Candidate Evaluation` = "For Trump + Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lrdd16_fct)


lrdd_tab_rep <- a16 %>% 
    filter(pid4_16 == "Rep") %>% 
    tabyl(nonresp_lrdd16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>% 
    as.data.frame() %>% 
    mutate(`Party ID` = "Republican", 
           `Candidate Evaluation` = "For Trump + Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lrdd16_fct)

lrdd_tab_ind <- a16 %>% 
    filter(pid4_16 == "Ind") %>% 
    tabyl(nonresp_lrdd16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>% 
    as.data.frame() %>% 
    mutate(`Party ID` = "Independent", 
           `Candidate Evaluation` = "For Trump + Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lrdd16_fct)


lddr_tab_oth <- a16 %>% 
    filter(pid4_16 == "Other") %>% 
    tabyl(nonresp_lddr16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Other", 
           `Candidate Evaluation` = "For Clinton + Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lddr16_fct)

lrdd_tab_oth <- a16 %>% 
    filter(pid4_16 == "Other") %>% 
    tabyl(nonresp_lrdd16_fct) %>% 
    adorn_pct_formatting(digits = 0) %>% 
    as.data.frame() %>% 
    mutate(`Party ID` = "Other", 
           `Candidate Evaluation` = "For Trump + Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_lrdd16_fct)

# conditional on party, what is the distribution over nonresponse
party_and_nonresp_tab <- rbind(lddr_tab_dem, lrdd_tab_dem, lddr_tab_ind, lrdd_tab_ind, lddr_tab_rep, lrdd_tab_rep)
#party_and_nonresp_tab
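
# Illustrative sketch (an assumption, not the code used for the tables above):
# the per-party blocks repeat one pattern, so a small helper could build each
# piece. `tab_nonresp_by_party()` and its argument names are hypothetical; the
# helper assumes dplyr and janitor are attached, as elsewhere in this chunk.
tab_nonresp_by_party <- function(dat, pid_level, pid_label, nonresp_var, prompt_label) {
    dat %>%
        filter(pid4_16 == pid_level) %>%
        rename(Nonresponse = all_of(nonresp_var)) %>%  # nonresp_var is a column name given as a string
        tabyl(Nonresponse) %>%
        adorn_pct_formatting(digits = 0) %>%
        as.data.frame() %>%
        mutate(`Party ID` = pid_label,
               `Candidate Evaluation` = prompt_label,
               .before = 1)
}
# Example call, left commented out so the chunk's results are unchanged:
# tab_nonresp_by_party(a16, "Dem", "Democrat", "nonresp_lddr16_fct",
#                      "For Clinton + Against Trump")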

names(lddr_tab)[1] <- c("nonresp")
lddr_tab <- data.frame(lddr_tab, valence = "For Clinton + Against Trump")

names(lrdd_tab)[1] <- c("nonresp")
lrdd_tab <- data.frame(lrdd_tab, valence = "For Trump + Against Clinton")

nonresp_tab <- rbind(lddr_tab, lrdd_tab)
nonresp_tab <- nonresp_tab[, c(4,1,2,3)]
```


```{r party-and-nonresp-tab-pair-run, results = 'hide', include = FALSE}

shade_rows <- c(seq(from = 0, to = 17, by = 6), 
                seq(from = 1, to = 17, by = 6), 
                seq(from = 2, to = 17, by = 6)) %>% 
    sort()

# party_and_nonresp_tab$`Candidate Evaluation` <- dedupe_runs(party_and_nonresp_tab$`Candidate Evaluation`, keep = 1)


# calculated here, presented in appendix
party_and_nonresp_tab %>%
  mutate(n = add_comma(n),
         `Candidate Evaluation` = dedupe_runs(`Candidate Evaluation`, keep = 1)) %>%
  kbl(
    format   = kable_format,
    booktabs = TRUE,
    caption  = "Nonresponse by Party ID and Congruent Candidate Evaluation Prompts",
    col.names = c("Party ID", "Candidate Evaluation", "# Nonresponse", "n", "Percent"),
    align = c("l", "l", "r", "r", "r")
  ) %>%
  kable_styling(latex_options = "hold_position") %>%
  row_spec(shade_rows, extra_latex_after = "\\rowcolor{gray!10}") %>%
  collapse_rows(columns = 1, latex_hline = "major", valign = "top") %>%
  sub("\\\\toprule", "", .) %>%
  sub("\\\\bottomrule", "", .) 


```


```{r party-and-nonresp-tab-single-run, results = 'hide', include = FALSE}
ld_tab_dem <- a16 %>% 
    filter(pid4_16 == "Dem") %>% 
    tabyl(nonresp_like_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Democrat", 
           `Candidate Evaluation` = "For Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_dem)


dd_tab_dem <- a16 %>% 
    filter(pid4_16 == "Dem") %>% 
    tabyl(nonresp_dislike_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Democrat", 
           `Candidate Evaluation` = "Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_dem)

lr_tab_dem <- a16 %>% 
    filter(pid4_16 == "Dem") %>% 
    tabyl(nonresp_like_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Democrat", 
           `Candidate Evaluation` = "For Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_rep)


dr_tab_dem <- a16 %>% 
    filter(pid4_16 == "Dem") %>% 
    tabyl(nonresp_dislike_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Democrat", 
           `Candidate Evaluation` = "Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_rep)

tab_dem <- rbind(ld_tab_dem, dd_tab_dem, lr_tab_dem, dr_tab_dem)

# independents
ld_tab_ind <- a16 %>% 
    filter(pid4_16 == "Ind") %>% 
    tabyl(nonresp_like_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Independent", 
           `Candidate Evaluation` = "For Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_dem)


dd_tab_ind <- a16 %>% 
    filter(pid4_16 == "Ind") %>% 
    tabyl(nonresp_dislike_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Independent", 
           `Candidate Evaluation` = "Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_dem)

lr_tab_ind <- a16 %>% 
    filter(pid4_16 == "Ind") %>% 
    tabyl(nonresp_like_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Independent", 
           `Candidate Evaluation` = "For Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_rep)


dr_tab_ind <- a16 %>% 
    filter(pid4_16 == "Ind") %>% 
    tabyl(nonresp_dislike_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Independent", 
           `Candidate Evaluation` = "Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_rep)

tab_ind <- rbind(ld_tab_ind, dd_tab_ind, lr_tab_ind, dr_tab_ind)


# republicans
ld_tab_rep <- a16 %>% 
    filter(pid4_16 == "Rep") %>% 
    tabyl(nonresp_like_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Republican", 
           `Candidate Evaluation` = "For Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_dem)


dd_tab_rep <- a16 %>% 
    filter(pid4_16 == "Rep") %>% 
    tabyl(nonresp_dislike_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Republican", 
           `Candidate Evaluation` = "Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_dem)

lr_tab_rep <- a16 %>% 
    filter(pid4_16 == "Rep") %>% 
    tabyl(nonresp_like_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Republican", 
           `Candidate Evaluation` = "For Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_rep)


dr_tab_rep <- a16 %>% 
    filter(pid4_16 == "Rep") %>% 
    tabyl(nonresp_dislike_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Republican", 
           `Candidate Evaluation` = "Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_rep)

tab_rep <- rbind(ld_tab_rep, dd_tab_rep, lr_tab_rep, dr_tab_rep)


# other
ld_tab_oth <- a16 %>% 
    filter(pid4_16 == "Other") %>% 
    tabyl(nonresp_like_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Other", 
           `Candidate Evaluation` = "For Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_dem)


dd_tab_oth <- a16 %>% 
    filter(pid4_16 == "Other") %>% 
    tabyl(nonresp_dislike_dem) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Other", 
           `Candidate Evaluation` = "Against Clinton",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_dem)

lr_tab_oth <- a16 %>% 
    filter(pid4_16 == "Other") %>% 
    tabyl(nonresp_like_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Other", 
           `Candidate Evaluation` = "For Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_like_rep)


dr_tab_oth <- a16 %>% 
    filter(pid4_16 == "Other") %>% 
    tabyl(nonresp_dislike_rep) %>% 
    adorn_pct_formatting(digits = 0) %>%  
    as.data.frame() %>% 
    mutate(`Party ID` = "Other", 
           `Candidate Evaluation` = "Against Trump",
           .before = 1) %>% 
    rename(`Nonresponse` = nonresp_dislike_rep)

tab_oth <- rbind(ld_tab_oth, dd_tab_oth, lr_tab_oth, dr_tab_oth)


# conditional on party, what is the distribution over nonresponse
party_and_nonresp_tab2 <- rbind(tab_dem, tab_ind, tab_rep, tab_oth)

#party_and_nonresp_tab2

# names(lddr_tab)[1] <- c("nonresp")
# lddr_tab <- data.frame(lddr_tab, valence = "For Clinton + Against Trump")
# 
# names(lrdd_tab)[1] <- c("nonresp")
# lrdd_tab <- data.frame(lrdd_tab, valence = "For Trump + Against Clinton")
# 
# nonresp_tab <- rbind(lddr_tab, lrdd_tab)
# nonresp_tab <- nonresp_tab[, c(4,1,2,3)]


shade_rows <- c(seq(from = 0, to = 23, by = 4), seq(from = 1, to = 23, by = 4)) %>% sort()

# party_and_nonresp_tab2$`Candidate Evaluation` <- dedupe_runs(party_and_nonresp_tab2$`Candidate Evaluation`, keep = 1)

# calculated here, presented in appendix
party_and_nonresp_tab2 %>% 
    mutate(n = add_comma(n),
           `Candidate Evaluation` = dedupe_runs(`Candidate Evaluation`, keep = 1)) %>% 
  kbl(
    format   = kable_format,
    booktabs = TRUE,
    caption  = "Nonresponse by Party ID and Individual Candidate Evaluation Prompts",
    col.names = c("Party ID", "Candidate Evaluation", "# Nonresponse", "n", "Percent"),
    align     = c("l", "l", "r", "r", "r") #,
    #linesep   = c('', '\\addlinespace') # overwritten by collapse_rows
    ) %>% 
  kable_styling(latex_options = c("hold_position")) %>% 
  row_spec(shade_rows, extra_latex_after = "\\rowcolor{gray!10}") %>%
  collapse_rows(columns = 1, latex_hline = "major", valign = "top") %>% 
  #collapse_rows(columns = 1:2, row_group_label_position = "stack") %>%
    # pack_rows("For Clinton / Against Trump", 1, 3) %>% 
    # pack_rows("For Trump / Against Clinton", 4, 6, latex_gap_space = "1em")
    sub("\\\\toprule",    "", .) %>%
    #sub("\\\\midrule",    "", .) %>% 
    sub("\\\\bottomrule", "", .) 

```


```{r party-and-nonresp-tab-single, results = 'asis', include = TRUE}
shade_rows <- c(seq(from = 0, to = 23, by = 4), seq(from = 1, to = 23, by = 4)) %>% sort()
party_and_nonresp_tab2 %>%
    mutate(n = add_comma(n),
           `Candidate Evaluation` = dedupe_runs(`Candidate Evaluation`, keep = 1)) %>%
  kbl(
    format   = kable_format,
    booktabs = TRUE,
    caption  = "Nonresponse by Party ID and Individual Candidate Evaluation Prompts",
    col.names = c("Party ID", "Candidate Evaluation", "# Nonresponse", "n", "Percent"),
    align     = c("l", "l", "r", "r", "r")
    ) %>%
  kable_styling(latex_options = c("hold_position")) %>%
  row_spec(shade_rows, extra_latex_after = "\\rowcolor{gray!10}") %>%
  collapse_rows(columns = 1, latex_hline = "major", valign = "top") %>%
    sub("\\\\toprule",    "", .) %>%
    sub("\\\\bottomrule", "", .)
```


```{r party-and-nonresp-tab-pair, results = 'asis', include = TRUE}
shade_rows <- c(seq(from = 0, to = 17, by = 6),
                seq(from = 1, to = 17, by = 6),
                seq(from = 2, to = 17, by = 6)) %>%
    sort()
party_and_nonresp_tab %>%
  mutate(n = add_comma(n),
         `Candidate Evaluation` = dedupe_runs(`Candidate Evaluation`, keep = 1)) %>%
  kbl(
    format   = kable_format,
    booktabs = TRUE,
    caption  = "Nonresponse by Party ID and Congruent Candidate Evaluation Prompts",
    col.names = c("Party ID", "Candidate Evaluation", "# Nonresponse", "n", "Percent"),
    align = c("l", "l", "r", "r", "r")
  ) %>%
  kable_styling(latex_options = "hold_position") %>%
  row_spec(shade_rows, extra_latex_after = "\\rowcolor{gray!10}") %>%
  collapse_rows(columns = 1, latex_hline = "major", valign = "top") %>%
  sub("\\\\toprule", "", .) %>%
  sub("\\\\bottomrule", "", .)
```

\FloatBarrier

### Study 1a: Voting Behavior by Registration Status in 2016 and 2020 {#sec:vote-by-reg-status1620}

```{r voted-by-registration-table16, results = 'asis'}
vote_by_reg_n <- a16 %>% 
  tabyl(reg16_bin, vote_validated16)


vote_by_reg_perc <- a16 %>% 
  tabyl(reg16_bin, vote_validated16) %>% 
  adorn_percentages() %>% 
  adorn_pct_formatting(1)

# drop row names
vote_by_reg_n <- vote_by_reg_n[, -1] %>% 
    add_comma()
vote_by_reg_perc <- vote_by_reg_perc[, -1]

# combine stats
combined_table <- matrix(
  mapply(function(n, p) paste(n, " (", p, ")", sep = ""),
         vote_by_reg_n, vote_by_reg_perc),
  nrow = 2, # specify the number of rows to match the original tables
  byrow = FALSE
)
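
# Illustrative sketch with made-up values (not study data): mapply() walks the
# two tables column by column, and paste0() is vectorized over rows, so each
# cell becomes "count (percent)". `toy_n`, `toy_perc`, and `toy_cells` are
# hypothetical names used only for this example.
toy_n     <- data.frame(no = c("10", "20"), yes = c("30", "40"))
toy_perc  <- data.frame(no = c("25.0%", "33.3%"), yes = c("75.0%", "66.7%"))
toy_cells <- mapply(function(n, p) paste0(n, " (", p, ")"), toy_n, toy_perc)
stopifnot(toy_cells[1, "no"] == "10 (25.0%)", toy_cells[2, "yes"] == "40 (66.7%)")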

rownames(combined_table) <- c("Not Registered", "Registered")

combined_table_tab <- combined_table %>% 
      kbl(
    align     = c("r", "r"),     
    format    = "latex",
    booktabs  = TRUE,
    caption   = "Turnout by Registration Status in 2016 \\label{tab:vote_by_reg16}",
    col.names = c("", "Did Not Vote", "Voted")
    ) %>% 
  kable_styling(latex_options = "hold_position")

# kableExtra::save_kable(x = combined_table_tab, file = here("text_docs", "combined-table-tab.tex"))

combined_table_tab


# a16 %>% 
#     mutate(reg16_bin_fct = 
#         case_when(reg16_bin == 0 ~ "Not Registered",
#                   reg16_bin == 1 ~ "Registered")
#     ) %>% 
#   tabyl(reg16_bin_fct, vote_validated16) %>% 
#   adorn_percentages() %>% 
#   adorn_pct_formatting(1) %>% 
#   kbl(
#     format   = kable_format,
#     booktabs  = TRUE,
#     caption   = "Voting Behavior by Registration Status \\label{tab:vote_by_reg2}",
#     col.names = c("", "Did Not Vote", "Voted"),
#     ) %>% 
#   kable_styling(latex_options = "hold_position") 

```


```{r voted-by-registration-table20-fct, results = 'asis'}
# Create the basic tabyl with counts
reg_turnout_n <- a20 %>% 
    filter(!is.na(reg20_bin)) %>% 
    filter(!is.na(vote_validated20_bin)) %>% 
  tabyl(reg20_bin, vote_validated20_bin)

# Create the tabyl with percentages
reg_turnout_perc <- a20 %>% 
    filter(!is.na(reg20_bin)) %>% 
    filter(!is.na(vote_validated20_bin)) %>% 
  tabyl(reg20_bin, vote_validated20_bin) %>% 
  adorn_percentages() %>% 
  adorn_pct_formatting(1)

# Drop row names (first column)
reg_turnout_n <- reg_turnout_n[, -1] %>% 
  add_comma()
reg_turnout_perc <- reg_turnout_perc[, -1]

# Combine counts and percentages
combined_table <- matrix(
  mapply(function(n, p) paste(n, " (", p, ")", sep = ""),
         reg_turnout_n, reg_turnout_perc),
  nrow = nrow(reg_turnout_n), # match the number of rows in the counts table
  byrow = FALSE
)

# Set row names to match the reg20_bin categories
rownames(combined_table) <- c("Not Registered", "Registered")

# Create the formatted table
combined_table20_tab <- combined_table %>% 
  kbl(
    align     = c("r", "r"),     
    format    = "latex",
    booktabs  = TRUE,
    caption   = "Turnout by Registration Status in 2020 \\label{tab:turnout_by_reg}",
    col.names = c("", "Did Not Vote", "Voted")  
  ) %>% 
  kable_styling(latex_options = "hold_position")

# Save the table
# kableExtra::save_kable(x = combined_table20_tab, 
#                       file = here("text_docs", "reg-turnout-table20.tex"))

# Display the table
combined_table20_tab
```

\FloatBarrier


### Study 1a: Number of character summary statistics for candidate for/against responses by mode and year {#sec:summ-stats-partisan}


```{r summ-stats-partisan-table-2016, results = 'asis'}

# Create summary stats per mode16 for each evaluation
like_dislike_summary <- bind_rows(
    a16 %>%
        group_by(mode16) %>%
        summarize(
            Variable = "For Clinton",
            max      = max(nchar_like_dem, na.rm = TRUE),
            mean     = mean(nchar_like_dem, na.rm = TRUE),
            median   = median(nchar_like_dem, na.rm = TRUE)
        ),
    a16 %>%
        group_by(mode16) %>%
        summarize(
            Variable = "Against Clinton",
            max      = max(nchar_dislike_dem, na.rm = TRUE),
            mean     = mean(nchar_dislike_dem, na.rm = TRUE),
            median  = median(nchar_dislike_dem, na.rm = TRUE)
        ),
    a16 %>%
        group_by(mode16) %>%
        summarize(
            Variable = "For Trump",
            max      = max(nchar_like_rep, na.rm = TRUE),
            mean     = mean(nchar_like_rep, na.rm = TRUE),
            median   = median(nchar_like_rep, na.rm = TRUE)
        ),
    a16 %>%
        group_by(mode16) %>%
        summarize(
            Variable = "Against Trump",
            max      = max(nchar_dislike_rep, na.rm = TRUE),
            mean     = mean(nchar_dislike_rep, na.rm = TRUE),
            median   = median(nchar_dislike_rep, na.rm = TRUE)
        )
) %>%
    pivot_wider(
        names_from  = mode16,
        values_from = c(max, mean, median),
        names_glue  = "{mode16}_{.value}"
    ) %>%
    rename(
        `Variable` = Variable,
        `FTF`      = ftf_max,
        `Web`      = web_max,
        `FTF `     = ftf_mean,  # 1 space for uniqueness
        `Web `     = web_mean,  # 1 space for uniqueness
        `FTF  `    = ftf_median,# 2 spaces for uniqueness
        `Web  `    = web_median # 2 spaces for uniqueness
    )

# Output with kable
like_dislike_summary %>%
    kable0(
        caption = "Number of character summary statistics for candidate for/against responses by survey mode (2016)",
        #format = "latex", # or "latex"
        digits = 1,
        align = "lrrrrrr"
    ) %>%
        add_header_above(
        c(" " = 1, "Max" = 2, "Mean" = 2, "Median" = 2)
    ) %>% 
        footnote(
          general = "Face-to-face: n = 1,180, Web: n = 3,090.",
          general_title = "\\\\vspace{4mm}Note: ",
          footnote_as_chunk = TRUE,
          escape = FALSE
    ) #%>%
    #kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))

```


```{r summ-stats-partisan-table-2020, results = 'asis'}

# Create summary stats per survey mode for each evaluation
like_dislike_summary_2020 <- bind_rows(
    a20 %>%
        group_by(mode_pre20) %>%
        summarize(
            Variable = "For Biden",
            max      = max(text_dem20_like_nchar, na.rm = TRUE),
            mean     = mean(text_dem20_like_nchar, na.rm = TRUE),
            median   = median(text_dem20_like_nchar, na.rm = TRUE)
        ),
    a20 %>%
        group_by(mode_pre20) %>%
        summarize(
            Variable = "Against Biden",
            max      = max(text_dem20_dislike_nchar, na.rm = TRUE),
            mean     = mean(text_dem20_dislike_nchar, na.rm = TRUE),
            median   = median(text_dem20_dislike_nchar, na.rm = TRUE)
        ),
    a20 %>%
        group_by(mode_pre20) %>%
        summarize(
            Variable = "For Trump",
            max      = max(text_rep20_like_nchar, na.rm = TRUE),
            mean     = mean(text_rep20_like_nchar, na.rm = TRUE),
            median   = median(text_rep20_like_nchar, na.rm = TRUE)
        ),
    a20 %>%
        group_by(mode_pre20) %>%
        summarize(
            Variable = "Against Trump",
            max      = max(text_rep20_dislike_nchar, na.rm = TRUE),
            mean     = mean(text_rep20_dislike_nchar, na.rm = TRUE),
            median   = median(text_rep20_dislike_nchar, na.rm = TRUE)
        )
) %>%
    pivot_wider(
        names_from  = mode_pre20,
        values_from = c(max, mean, median),
        names_glue  = "{mode_pre20}_{.value}"
    ) %>%
    rename(
        `Variable`      = Variable,
        # Shorten column names - remove statistic from name
        `Tele`          = tele_max,
        `Video`         = video_max,
        `Web`           = web_max,
        `Tele `         = tele_mean,      # Note the space to make unique
        `Video `        = video_mean,     # Note the space to make unique  
        `Web `          = web_mean,       # Note the space to make unique
        `Tele  `        = tele_median,    # Note the double space to make unique
        `Video  `       = video_median,   # Note the double space to make unique
        `Web  `         = web_median      # Note the double space to make unique
    )


# Output with kable
like_dislike_summary_2020 %>%
    kable0(
        caption = "Number of character summary statistics for candidate for/against responses by survey mode (2020)",
        digits = 1,
        align = "lrrrrrrrrr"
    ) %>%
    add_header_above(
        c(" " = 1, "Max" = 3, "Mean" = 3, "Median" = 3)
    ) %>% 
        footnote(
          general = "Tele: n = 139, Video: n = 359, Web: n = 7,782.",
          general_title = "\\\\vspace{4mm}Note: ",
          footnote_as_chunk = TRUE,
          escape = FALSE
    ) #%>%
    # kable_styling(
    #     full_width = FALSE,
    #     bootstrap_options = c("striped", "hover")
    # )

```


```{r summ-stats-partisan-table-2024, results = 'asis'}
# Create summary stats per survey mode for each evaluation (excluding paper mode, which has no text data)
like_dislike_summary_2024 <- bind_rows(
    a24 %>%
        filter(mode_pre24 != "paper") %>%  # Exclude paper mode
        group_by(mode_pre24) %>%
        summarize(
            Variable = "For Harris",
            max      = max(text_dem24_like_nchar, na.rm = TRUE),
            mean     = mean(text_dem24_like_nchar, na.rm = TRUE),
            median   = median(text_dem24_like_nchar, na.rm = TRUE)
        ),
    a24 %>%
        filter(mode_pre24 != "paper") %>%  # Exclude paper mode
        group_by(mode_pre24) %>%
        summarize(
            Variable = "Against Harris",
            max      = max(text_dem24_dislike_nchar, na.rm = TRUE),
            mean     = mean(text_dem24_dislike_nchar, na.rm = TRUE),
            median   = median(text_dem24_dislike_nchar, na.rm = TRUE)
        ),
    a24 %>%
        filter(mode_pre24 != "paper") %>%  # Exclude paper mode
        group_by(mode_pre24) %>%
        summarize(
            Variable = "For Trump",
            max      = max(text_rep24_like_nchar, na.rm = TRUE),
            mean     = mean(text_rep24_like_nchar, na.rm = TRUE),
            median   = median(text_rep24_like_nchar, na.rm = TRUE)
        ),
    a24 %>%
        filter(mode_pre24 != "paper") %>%  # Exclude paper mode
        group_by(mode_pre24) %>%
        summarize(
            Variable = "Against Trump",
            max      = max(text_rep24_dislike_nchar, na.rm = TRUE),
            mean     = mean(text_rep24_dislike_nchar, na.rm = TRUE),
            median   = median(text_rep24_dislike_nchar, na.rm = TRUE)
        )
) %>%
    pivot_wider(
        names_from  = mode_pre24,
        values_from = c(max, mean, median),
        names_glue  = "{mode_pre24}_{.value}"
    ) %>%
    rename(
        `Variable`      = Variable,
        # Three mode categories (excluding paper)
        `FTF`           = ftf_max,
        `Tele`          = tele_max,
        `Web`           = web_max,
        `FTF `          = ftf_mean,       # Note the space to make unique
        `Tele `         = tele_mean,      # Note the space to make unique
        `Web `          = web_mean,       # Note the space to make unique
        `FTF  `         = ftf_median,     # Note the double space to make unique
        `Tele  `        = tele_median,    # Note the double space to make unique
        `Web  `         = web_median      # Note the double space to make unique
    )

# Output with kable
like_dislike_summary_2024 %>%
    kable0(
        caption = "Number of character summary statistics for candidate for/against responses by survey mode (2024)",
        digits = 1,
        align = "lrrrrrrrrr"  # 10 columns: 1 label + 3 statistics x 3 modes
    ) %>%
    add_header_above(
        c(" " = 1, "Max" = 3, "Mean" = 3, "Median" = 3)  # each statistic spans 3 modes
    ) %>% 
        footnote(
          general = "Pre-election, Face-to-face: n = 966, Tele: n = 76, Web: n = 4,234.",
          general_title = "\\\\vspace{4mm}Note: ",
          footnote_as_chunk = TRUE,
          escape = FALSE
    ) #%>%
    # kable_styling(
    #     full_width = FALSE,
    #     bootstrap_options = c("striped", "hover")
    # )

```

\newpage

### Study 1a: Candidate Choice in 2016 vs Nonresponse in 2016

```{r tables-nonresponse-lddr-lrdd, results = "asis", cache = cache_lgl}
cov_labels <- star_var(mod_nonresp16_fct, omit = c("pid.*Oth", "race16.*"))
star_ft(mod_nonresp16_fct,
        covariate.labels = cov_labels,
        dep.var.labels = rep(c("Clinton 2016", "Trump 2016"), 2),
        no.space = TRUE,
        title = "Candidate Choice vs Nonresponse (Discrete)",
        label = "tab:tables-nonresponse-lddr-lrdd",
        omit = c("pid.*Oth", "race16.*"),
        notes = c("Race variable included in models but omitted above for space.", "*$p<0.05$"),
        add.lines = list(
            c("Feeling Thermometer?", rep("\\text{No}", 2), rep("\\text{Yes}", 2))
           )
        )
```

\FloatBarrier

### Study 1a: Candidate Choice in 2020 and 2024 vs All Nonresponse in 2016 {#sec:vote20-24-vs-nonresponse-all}


\begin{footnotesize}
\begin{eqnarray}
\end{eqnarray}
\end{footnotesize}


```{r include = FALSE}

ggpredict(mod_nonresp20_fct[[1]], terms = c("nonresp_all16", "pid4_16 [Dem, Ind, Rep]"))
ggpredict(mod_nonresp20_fct[[2]], terms = c("nonresp_all16", "pid4_16 [Dem, Ind, Rep]"))

```


```{r figure-A1-1-models20-nonresponse-all, fig.height = 3.5, fig.width = 8, fig.cap = "Marginal effects of Nonresponse Scale in 2016 on probability of selecting candidate in 2020, by party identification (see Table\\ \\ref{tab:models20-nonresponse-all-table}).", cache = cache_lgl, include = TRUE}
pp20[[1]] + pp20[[2]] +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")
```


```{r models20-nonresponse-all-table, results = 'asis'}


cov_labels <- star_var(mod_nonresp20_fct, omit = "pid.*Oth")

star_ft(
    mod_nonresp20_fct,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    no.space         = TRUE,
    model.names      = TRUE,
    dep.var.labels   = c("Vote Biden 2020", "Vote Trump 2020"),
    #column.labels    = c("No Int", "No Int", "Interaction", "Interaction"),
    title = "Candidate Choice in 2020 vs Nonresponse Scale in 2016",
    label = "tab:models20-nonresponse-all-table"
    # add.lines = list(
    #         c("Party ID Interaction", "\\text{No}", "\\text{No}", "\\text{Yes}", "\\text{Yes}")
    #       )
  )

```


```{r figure-A1-2-models24-nonresponse-all, fig.height = 3.5, fig.width = 8, fig.cap = "Marginal effects of Nonresponse Scale in 2016 on probability of selecting candidate in 2024, by party identification (see Table\\ \\ref{tab:models24-nonresponse-all-table}).", cache = cache_lgl, include = TRUE}
pp24[[1]] + pp24[[2]] +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")
```


```{r models24-nonresponse-all-table, results = 'asis'}


cov_labels <- star_var(mod_nonresp24_fct, omit = "pid.*Oth")

star_ft(
    mod_nonresp24_fct,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    no.space         = TRUE,
    model.names      = TRUE,
    dep.var.labels   = c("Vote Harris 2024", "Vote Trump 2024"),
    #column.labels    = c("No Int", "No Int", "Interaction", "Interaction"),
    title = "Candidate Choice in 2024 vs Nonresponse Scale in 2016",
    label = "tab:models24-nonresponse-all-table"
    # add.lines = list(
    #         c("Party ID Interaction", "\\text{No}", "\\text{No}", "\\text{Yes}", "\\text{Yes}")
    #       )
  )

```

\FloatBarrier

### Study 1b: Interpreting Expressive Alignment {#sec:expressive-alignment}


```{r correlation-tables-aff, results = 'asis', cache = FALSE}


vars_aff <- c("nchar_align_ihs",  "ideo7_16", "pid7_int", "ft_clinton", "ft_trump",
              "racial_resent16", "sexism16", "authorit16", "ft_con_lib",
              "ft_trump_clinton", "ft_rep_dem")


# tidy approach
corr_aff <- a16 %>%
    dplyr::select(dplyr::all_of(vars_aff)) %>%  # all_of() avoids the deprecated bare-vector selection
    correlate(use = "pairwise.complete.obs") %>%
    focus(nchar_align_ihs) %>%
    rename(Term = term, 
           Correlation = nchar_align_ihs) %>% 
    mutate(Correlation = round(Correlation, 3)) %>% 
    arrange(desc(abs(Correlation))) 

corr_aff$Term <- convert_labels(corr_aff$Term, extracted = TRUE)

corr_aff %>%
      kbl(
        align     = c("r", "r"),     
        format    = kable_format,
        booktabs  = TRUE,
        caption   = "Select Correlates of Expressive Alignment \\label{tab:corr-express-affect}",
        #col.names = c("", "Did Not Vote", "Voted")
        linesep = ""
      ) %>% 
      kable_styling()
# latex_options = "hold_position"

```
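
The ranking logic behind the correlation table can be sketched with base R alone. The snippet below is illustrative only: it uses simulated data with hypothetical variable names (`align`, `ft_diff`, `resent`, `noise`) and `cor()` in place of the `corrr` helpers used in the chunk above.

```r
# Simulated stand-ins for survey variables; all names are hypothetical.
set.seed(1)
n <- 200
toy <- data.frame(align = rnorm(n))
toy$ft_diff <- toy$align + rnorm(n, sd = 0.5)   # strongly related to align
toy$resent  <- -0.5 * toy$align + rnorm(n)      # moderately, negatively related
toy$noise   <- rnorm(n)                         # unrelated
toy$ft_diff[1:20] <- NA                         # mimic item nonresponse

# Pairwise-complete correlations of every variable with `align`
r <- cor(toy, use = "pairwise.complete.obs")[, "align"]
r <- r[names(r) != "align"]

# Rank by absolute magnitude so strong negative correlates are not buried
corr_tab <- data.frame(Term = names(r), Correlation = round(unname(r), 3))
corr_tab <- corr_tab[order(-abs(corr_tab$Correlation)), ]
print(corr_tab)
```

Pairwise deletion means each correlation may rest on a different subset of respondents, which is worth keeping in mind when comparing rows.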


```{r figure-A1-3-correlation-pca-plot-aff, fig.height = 4.5, fig.width = 4.5, fig.align = 'center', fig.cap = 'Principal Component Analysis shows that Expressive Alignment (nchar\\_align\\_ihs) loads heavily on the first principal component, which captures a liberal/Clinton to conservative/Trump axis. A secondary component, defined primarily by authoritarianism and sexism, suggests a moral traditionalism dimension orthogonal to partisanship. These results indicate that higher Expressive Alignment is strongly associated with ideological and partisan alignment, but distinct from the moral/hierarchical values captured in Dimension 2.', cache = FALSE}

pca_vars_aff   <- a16[, vars_aff]
pca_vars_aff   <- scale(pca_vars_aff)
pca_result_aff <- PCA(pca_vars_aff, graph = FALSE)

# Plot variables and loadings
pca_affect <- fviz_pca_var(pca_result_aff,
                col.var = "contrib", # color by contribution to PCs
                gradient.cols = c("blue", "red"),
                repel = TRUE) + 
    labs(title = "Expressive Alignment PCA", 
         color = "Contribution")

pca_affect

```
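
As a rough check on how to read a loading plot of this kind, the following sketch runs a PCA on simulated data with base R's `prcomp()` rather than `FactoMineR::PCA()`. The variable names and the single latent axis are assumptions made for illustration, not features of the ANES data.

```r
# Simulated data: three indicators of one latent axis plus an unrelated item.
set.seed(2)
n <- 300
latent <- rnorm(n)                       # hypothetical partisan axis
toy <- data.frame(
  align   = latent + rnorm(n, sd = 0.4),
  ft_diff = latent + rnorm(n, sd = 0.4),
  ideo    = latent + rnorm(n, sd = 0.6),
  other   = rnorm(n)                     # unrelated to the latent axis
)

# prcomp() does not accept missing values, so keep complete cases only
toy <- toy[complete.cases(toy), ]
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Share of variance per component, and loadings on the first two components
var_share <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_share, 2))
print(round(pca$rotation[, 1:2], 2))
```

Variables that share the latent axis load together on the first component, while the unrelated item contributes mainly to a later one.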

\FloatBarrier


```{r models_vote_nchar_lddr_lrdd, include = FALSE, cache = cache_lgl}


mod_nchar_valid_ihs <- list()

mod_nchar_valid_ihs[[1]] <- glm(vote_dem16 ~ nchar_like_dem16_ihs + pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,#,
    family = binomial,
    data = a16)

mod_nchar_valid_ihs[[2]] <- glm(vote_rep16 ~ nchar_like_rep16_ihs + pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, #,
    family = binomial,
    data = a16)

mod_nchar_valid_ihs[[3]] <- glm(vote_dem16 ~ nchar_like_dem16_ihs * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16,#,
    family = binomial,
    data = a16)

mod_nchar_valid_ihs[[4]] <- glm(vote_rep16 ~ nchar_like_rep16_ihs * pid4_16 + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, #,
    family = binomial,
    data = a16)

```


```{r include = FALSE}
# generate predicted probabilities for prose to describe plots
# perhaps automate
ggeffects::ggpredict(mod_nchar_valid_ihs[[3]], terms = c("nchar_like_dem16_ihs [0,5,7.5]", "pid4_16 [Dem, Ind, Rep]"))

ggeffects::ggpredict(mod_nchar_valid_ihs[[4]], terms = c("nchar_like_rep16_ihs [0,5,7.5]", "pid4_16 [Dem, Ind, Rep]"))
```
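
The predicted probabilities requested above can also be computed by hand, which makes the role of the interaction explicit. This is a minimal sketch on simulated data: the coefficients, the three-level party factor, and the grid values are assumptions, and base `predict()` stands in for `ggeffects::ggpredict()`.

```r
# Simulated data in which the slope on x differs by party (hypothetical values).
set.seed(3)
n   <- 1500
pid <- factor(sample(c("Dem", "Ind", "Rep"), n, replace = TRUE))
x   <- runif(n, 0, 7.5)                  # stand-in for an IHS-scaled character count
slope <- c(Dem = 0.6, Ind = 0.3, Rep = 0.1)[as.character(pid)]
base  <- c(Dem = 0, Ind = -1.5, Rep = -3)[as.character(pid)]
y <- rbinom(n, 1, plogis(base + slope * x))

# Logit with an x-by-party interaction
fit <- glm(y ~ x * pid, family = binomial)

# Predicted probabilities at chosen x values for each party
grid <- expand.grid(x = c(0, 5, 7.5), pid = levels(pid))
grid$prob <- predict(fit, newdata = grid, type = "response")
print(grid)
```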

\FloatBarrier


```{r table-vote-nonresp-nchar-all, results = 'asis', cache = cache_lgl}
models <- list(m_dem_comb_int_ihs, m_rep_comb_int_ihs)
cov_labels <- star_var(models, omit = "pid.*Oth")
star_ft(models,
    covariate.labels = cov_labels,
    no.space         = TRUE,
    model.names      = TRUE,
    omit             = "pid.*Oth",
    dep.var.labels = c("Vote Clinton 2016", "Vote Trump 2016"),
    title = "Candidate Choice in 2016 vs Expressive Alignment 2016",
    label = "tab:table-vote-nonresp-nchar-all")
```

\FloatBarrier

### Study 1b: Candidate Choice in 2020 vs Expressive Alignment 2016 {#sec:vote20-vs-express-partisan}


```{r figure-A1-4-biden-trump-vote-comb-plot, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on candidate preference in 2020, by party identification (see Table\\ \\ref{tab:biden-trump-2020-table}).", include = TRUE, cache = cache_lgl}
biden20_plot + trump20_plot +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")
```


```{r biden-trump-2020-table, results = 'asis'}
models <- list(biden20_out, trump20_out)
cov_labels <- star_var(models, omit = "pid.*Oth")
star_ft(
    models,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    dep.var.labels   = c("Vote Biden 2020", "Vote Trump 2020"),
    no.space         = TRUE,
    title = "Candidate Choice in 2020 vs Expressive Alignment 2016",
    label = "tab:biden-trump-2020-table")
```

\FloatBarrier

### Study 1b: Candidate Choice in 2024 vs Expressive Alignment 2016 {#sec:vote24-vs-express-partisan}


```{r figure-A1-5-harris-trump-vote-comb-plot, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on candidate preference in 2024, by party identification (see Table\\ \\ref{tab:harris-trump-2024-table}).", include = TRUE, cache = cache_lgl}
harris24_plot + trump24_plot +
    plot_layout(guides = "collect") +
    plot_annotation(tag_levels = "A")
```


```{r harris-trump-2024-table, results = 'asis'}
models <- list(harris24_out, trump24_out)
cov_labels <- star_var(models, omit = "pid.*Oth")
star_ft(
    models,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    dep.var.labels   = c("Vote Harris 2024", "Vote Trump 2024"),
    no.space         = TRUE,
    title = "Candidate Choice in 2024 vs Expressive Alignment 2016",
    label = "tab:harris-trump-2024-table")
```

\FloatBarrier


## Study 2: Party Switching {#sec:appendix-study2}

### Party Switching between 2016 and 2020 vs Expressive Alignment in 2016 {#sec:party-switching-2016-2020}

```{r switcher2-multinomial-table, results = 'asis'}
cov_labels <- star_var(list(mout_all), omit = "pid.*Oth")
stargazer_output <- capture.output(
    star_ft(mout_all,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    no.space         = TRUE,
    title = "Predicted probability of Party ID stability or switching between 2016 and 2020 vs Expressive Alignment in 2016 using multinomial model with Ind16 $\\rightarrow$ Ind20 as the reference category.",
    label = "tab:switcher2-multinomial-table")
)
stargazer_latex_fixed <- stargazer_output %>%
    str_replace_all("-\\\\textgreater ", " $\\\\rightarrow$ ") %>%
    str_replace_all("Dem", "D") %>%
    str_replace_all("Rep", "R")
cat(stargazer_latex_fixed, sep = "\n")
```
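
The string post-processing above can be tried on a couple of made-up lines. The sketch below uses base `gsub()` instead of `stringr`, and the two input lines are hypothetical rather than actual table output. It also shows one caveat: a blanket replacement of `Dem` or `Rep` rewrites any covariate label that happens to contain those letters.

```r
# Hypothetical lines resembling LaTeX table output with an escaped "->".
tex_lines <- c(
  " & Dem16-\\textgreater Rep20 & Rep16-\\textgreater Dem20 \\\\",
  "Democratic thermometer & 0.12 & -0.08 \\\\"
)

# Replace the escaped arrow with a math-mode arrow
arrows <- gsub("-\\\\textgreater ", " $\\\\rightarrow$ ", tex_lines)

# Blanket abbreviation, as in the chunk above
blanket <- gsub("Rep", "R", gsub("Dem", "D", arrows))

# Targeted abbreviation: only shorten labels followed by a year digit
targeted <- gsub("Dem(?=[0-9])", "D", arrows, perl = TRUE)
targeted <- gsub("Rep(?=[0-9])", "R", targeted, perl = TRUE)

cat(blanket, sep = "\n")
cat(targeted, sep = "\n")
```

Whether the targeted variant is needed depends on the covariate labels in the actual table.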

\FloatBarrier

### Party Switching between 2016 and 2024 vs Expressive Alignment in 2016 {#sec:party-switching-2016-2024}

```{r figure-A1-6-model-party-switching-16-to-24, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on likelihood of maintaining same party identification between 2016 and 2024 using multinomial model (see Table\\ \\ref{tab:table-party-switching-16-to-24}).", include = TRUE, cache = cache_lgl}
plot_party_switch_16_24
```


```{r table-party-switching-16-to-24, results = 'asis'}
cov_labels <- star_var(mout24, omit = "pid.*Oth")
stargazer_output <- capture.output(
    star_ft(mout24,
    covariate.labels = cov_labels,
    omit             = "pid.*Oth",
    no.space         = TRUE,
    title = "Predicted probability of Party ID stability or switching between 2016 and 2024 vs Expressive Alignment in 2016 using multinomial model with Ind16 $\\rightarrow$ Ind24 as the reference category.",
    label = "tab:table-party-switching-16-to-24")
)
stargazer_latex_fixed <- stargazer_output %>%
    str_replace_all("-\\\\textgreater ", " $\\\\rightarrow$ ") %>%
    str_replace_all("Dem", "D") %>%
    str_replace_all("Rep", "R")
cat(stargazer_latex_fixed, sep = "\n")
```

\FloatBarrier


### Party Switching between 2020 and 2024 vs Expressive Alignment in 2016 {#sec:party-switching-2020-2024}


```{r model-party-switching-20-to-24}
## ---- model-party-switching-20-to-24 ----


mout20_24 <- a24 %>%  
    filter(reg16_bin == 1) %>% # registered voters
    #filter(switcher_fct == "Dem16->Dem20" | switcher_fct == "Ind16->Ind20" | switcher_fct == "Rep16->Rep20") %>% 
    multinom(switcher2024_fct ~ nchar_align_ihs + ideo7_16 + ft_trump_clinton + racial_resent16 + sexism16 + authorit16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16, data = ., trace = FALSE, Hess = TRUE)
```
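
For readers less familiar with multinomial logit, the sketch below fits `nnet::multinom()` to simulated data and recovers predicted category probabilities across a grid. The three outcome labels, the coefficients, and the single predictor are assumptions for illustration, and base `predict()` is used in place of `marginaleffects`.

```r
library(nnet)

# Simulated three-category outcome whose probabilities shift with x.
set.seed(4)
n <- 900
x <- runif(n, -1, 1)                       # stand-in for a -1 to 1 alignment scale
eta <- cbind(Ind = 0, Dem = -2 * x, Rep = 2 * x)
p   <- exp(eta) / rowSums(exp(eta))
y <- factor(
  apply(p, 1, function(pr) sample(colnames(p), 1, prob = pr)),
  levels = c("Ind", "Dem", "Rep")          # first level is the reference category
)

fit <- multinom(y ~ x, trace = FALSE, Hess = TRUE)

# Predicted probabilities for every category across a grid of x
grid  <- data.frame(x = seq(-1, 1, length.out = 5))
probs <- predict(fit, newdata = grid, type = "probs")
print(round(cbind(grid, probs), 3))
```

Each row of `probs` sums to one, so a rise in one category's probability necessarily comes at the expense of the others.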


```{r figure-A1-7-plot-party-switching-20-to-24, fig.height = 3.5, fig.width = 8, dev = c("cairo_pdf", "png"), fig.cap = "Marginal effects of Expressive Alignment in 2016 on likelihood of maintaining same party identification between 2020 and 2024 using multinomial model (see Table\\ \\ref{tab:table-party-switching-20-to-24}).", cache = cache_lgl}
## ---- figure-A1-7-plot-party-switching-20-to-24 ----

preds <- predictions(
    mout20_24,
    newdata   = datagrid(nchar_align_ihs = seq(-1, 1, length.out = 10)),
    variables = "nchar_align_ihs"
)


custom_colors <- c(
    "Dem20 → Dem24" = "#013388",
    "Rep20 → Rep24" = "#cc0000",
    "Ind20 → Ind24" = "mediumpurple"
)

custom_linetypes <- c(
    "Dem20 → Dem24" = "solid",
    "Rep20 → Rep24" = "solid",
    "Ind20 → Ind24" = "solid"
)

preds %>%
    filter(str_detect(group, "Dem20->Dem24|Ind20->Ind24|Rep20->Rep24")) %>%
    mutate(group = str_replace(group, "->", " → ")) %>%
    #filter(!str_detect(group, "!")) %>% 
    #filter(!str_detect(group, "Ind")) %>%
    ggplot() +
    aes(linetype = group, fill = group, color = group) +
    # geom_smooth(aes(x     = nonresp_nchar_all_log, y = estimate), method = "loess", se = FALSE) +
    geom_ribbon(
        aes(
            x     = nchar_align_ihs,
            ymin  = conf.low,
            ymax  = conf.high,
            group = group,
            color = group
        ),
        alpha = 0.2, color = NA
    ) +
    geom_smooth(
        aes(x = nchar_align_ihs, y = estimate, group = group, color = group),
        method    = "loess", 
        se        = FALSE, 
        linewidth = 0.75
    ) +
    # geom_line(
    #     aes(
    #         x     = nchar_align_ihs,
    #         y     = estimate,
    #         color = group,
    #         group = group
    #     ),
    #     linewidth = 0.5
    # ) +
    scale_color_manual(values = custom_colors) +
    scale_fill_manual(values  = custom_colors) +  # So ribbons match line color
    #scale_linetype_manual(values = custom_linetypes) +
    facet_wrap(~group) +
    scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
    scale_x_continuous(
        breaks = c(-1, 0, 1),
        labels = c("-1\n              Clinton+", "0", "1\nTrump+             ")
    ) +
    labs(
        x = "Expressive Alignment 2016",
        y = element_blank() #"Predicted Probability"
    ) +
    #theme_minimal(base_size = 14) +
    theme_book() +
    theme(
        legend.position = "none",
        #strip.text.x = element_markdown(),
        #axis.text = element_text(size = 12),
        text = element_text(size = 13)
    )
```


```{r table-party-switching-20-to-24, results = 'asis'}
## ---- table-party-switching-20-to-24 ----

cov_labels <- star_var(mout20_24, omit = "pid.*Oth")

# Capture the output as a character vector
stargazer_output <- capture.output(
    star_ft(mout20_24,
            covariate.labels = cov_labels,
            omit             = "pid.*Oth",
            no.space         = TRUE,
            #model.names      = TRUE,
            #dep.var.labels   = c("P"),
            title = "Predicted probability of Party ID stability or switching between 2020 and 2024 vs Expressive Alignment in 2016 using multinomial model with Ind20 $\\rightarrow$ Ind24 as the reference category.",
            label = "tab:table-party-switching-20-to-24" #,
            #out = "multinomial-switcher-table.tex"
    )
)

# Replace -> with $\rightarrow$
stargazer_latex_fixed <- stargazer_output %>%
    str_replace_all("-\\\\textgreater ", " $\\\\rightarrow$ ") %>%
    str_replace_all("Dem", "D") %>%
    #str_replace_all("Ind", "I") %>%
    str_replace_all("Rep", "R")

# Print to console or write to file
cat(stargazer_latex_fixed, sep = "\n")
```

\FloatBarrier


## Study 3: Validated Turnout vs Expressive Engagement {#sec:appendix-study3}


### Number of character summary statistics for Most Important Problem response by survey mode and year {#sec:summ-stats-engagement}

```{r, summ-stats-problems-2016, results = 'asis'}
# Create summary stats per mode16 for each problem
problem_summary <- bind_rows(
    a16 %>% 
        group_by(mode16) %>% 
        summarize(
            problem = "Problem 1",
            max    = max(nchar_problem1, na.rm = TRUE),
            mean   = mean(nchar_problem1, na.rm = TRUE),
            median = median(nchar_problem1, na.rm = TRUE)
        ),
    a16 %>% 
        group_by(mode16) %>% 
        summarize(
            problem = "Problem 2",
            max    = max(nchar_problem2, na.rm = TRUE),
            mean   = mean(nchar_problem2, na.rm = TRUE),
            median = median(nchar_problem2, na.rm = TRUE)
        ),
    a16 %>% 
        group_by(mode16) %>% 
        summarize(
            problem = "Problem 3",
            max    = max(nchar_problem3, na.rm = TRUE),
            mean   = mean(nchar_problem3, na.rm = TRUE),  
            median = median(nchar_problem3, na.rm = TRUE)
        ),
    a16 %>% 
        group_by(mode16) %>% 
        summarize(
            problem = "Problem 4",
            max    = max(nchar_problem4, na.rm = TRUE),
            mean   = mean(nchar_problem4, na.rm = TRUE),
            median = median(nchar_problem4, na.rm = TRUE)
        )
) %>%
    # Reshape to wide format
    pivot_wider(
        names_from  = mode16,
        values_from = c(max, mean, median),
        names_glue  = "{mode16}_{.value}"
    ) %>%
    # Rename columns to match table
    rename(
        `Problem #`  = problem,
        `FTF`        = ftf_max,
        `Web`        = web_max, 
        `FTF `       = ftf_mean,   # 1 space for uniqueness
        `Web `       = web_mean,   # 1 space for uniqueness
        `FTF  `      = ftf_median, # 2 spaces for uniqueness
        `Web  `      = web_median  # 2 spaces for uniqueness
    )

# Create formatted kable
problem_summary %>%
    kable0(
        caption = "Number of character summary statistics for Most Important Problem responses by survey mode (2016)",
        #format = "latex", # use "latex" if you're using a LaTeX output
        digits = 1,
        align = "lrrrrrr"
    ) %>%
    add_header_above(
        c(" " = 1, "Max" = 2, "Mean" = 2, "Median" = 2)
    ) %>% 
        footnote(
          general = "Face-to-face: n = 1,180, Web: n = 3,090.",
          #general_title = "\\\\vspace{4mm}Note: ",
          footnote_as_chunk = TRUE,
          escape = FALSE
    ) #%>%
    #row_spec(nrow(problem_summary) + 1, extra_latex_after = "\\vspace{10mm}")
     #%>%
    #kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))


```
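
The repeated per-item `summarize()` blocks follow one pattern: compute statistics by item and mode, then spread modes into columns. A more compact version of that pattern is sketched below on a made-up four-row data set. The data and column names are hypothetical, and the reshaping via `pivot_longer()` is a suggested alternative, not the code used for the table.

```r
library(dplyr)
library(tidyr)

# Made-up data: two modes, two open-ended items, one missing response.
toy <- tibble(
  mode           = c("ftf", "ftf", "web", "web"),
  nchar_problem1 = c(10, 20, 5, NA),
  nchar_problem2 = c(0, 4, 8, 12)
)

toy_summary <- toy %>%
  # Stack the item columns so one summarize() call covers every item
  pivot_longer(starts_with("nchar_problem"),
               names_to = "problem", values_to = "nchar") %>%
  group_by(problem, mode) %>%
  summarize(
    max    = max(nchar, na.rm = TRUE),
    mean   = mean(nchar, na.rm = TRUE),
    median = median(nchar, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  # One column per mode-statistic pair, named like "ftf_max"
  pivot_wider(
    names_from  = mode,
    values_from = c(max, mean, median),
    names_glue  = "{mode}_{.value}"
  )

print(toy_summary)
```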


```{r, summ-stats-problems-2020, results = 'asis'}
# Create summary stats per survey mode (mode_pre20) for each problem
problem_summary_2020 <- bind_rows(
    a20 %>% 
        group_by(mode_pre20) %>% 
        summarize(
            problem = "Problem 1",
            max    = max(text_prob20_1_nchar, na.rm = TRUE),
            mean   = mean(text_prob20_1_nchar, na.rm = TRUE),
            median = median(text_prob20_1_nchar, na.rm = TRUE)
        ),
    a20 %>% 
        group_by(mode_pre20) %>% 
        summarize(
            problem = "Problem 2",
            max    = max(text_prob20_2_nchar, na.rm = TRUE),
            mean   = mean(text_prob20_2_nchar, na.rm = TRUE),
            median = median(text_prob20_2_nchar, na.rm = TRUE)
        ),
    a20 %>% 
        group_by(mode_pre20) %>% 
        summarize(
            problem = "Problem 3",
            max    = max(text_prob20_3_nchar, na.rm = TRUE),
            mean   = mean(text_prob20_3_nchar, na.rm = TRUE),  
            median = median(text_prob20_3_nchar, na.rm = TRUE)
        ),
    a20 %>% 
        group_by(mode_pre20) %>% 
        summarize(
            problem = "Problem 4",
            max    = max(text_prob20_4_nchar, na.rm = TRUE),
            mean   = mean(text_prob20_4_nchar, na.rm = TRUE),
            median = median(text_prob20_4_nchar, na.rm = TRUE)
        )
) %>%
    # Reshape to wide format
    pivot_wider(
        names_from  = mode_pre20,
        values_from = c(max, mean, median),
        names_glue  = "{mode_pre20}_{.value}"
    ) %>%
    # Rename columns to match table (assuming 2020 has tele, video, web modes)
    rename(
        `Problem #`  = problem,
        `Tele`       = tele_max,
        `Video`      = video_max,
        `Web`        = web_max, 
        `Tele `      = tele_mean,   # 1 space for uniqueness
        `Video `     = video_mean,  # 1 space for uniqueness
        `Web `       = web_mean,    # 1 space for uniqueness
        `Tele  `     = tele_median, # 2 spaces for uniqueness
        `Video  `    = video_median,# 2 spaces for uniqueness
        `Web  `      = web_median   # 2 spaces for uniqueness
    )

# Create formatted kable
problem_summary_2020 %>%
    kable0(
        caption = "Number of character summary statistics for Most Important Problem responses by survey mode (2020)",
        digits = 1,
        align = "lrrrrrrrrr"  # 10 columns: 1 label + 3 statistics x 3 modes
    ) %>%
    add_header_above(
        c(" " = 1, "Max" = 3, "Mean" = 3, "Median" = 3)  # Updated for 3 modes
    ) %>% 
        footnote(
          general = "Tele: n = 139, Video: n = 359, Web: n = 7,782.",
          #general_title = "\\\\vspace{4mm}Note: ",
          footnote_as_chunk = TRUE,
          escape = FALSE
    ) #%>%
    #row_spec(nrow(problem_summary) + 1, extra_latex_after = "\\\\vspace{10mm}")
    #%>%
    #kable_styling(threeparttable = TRUE)
```

#### Study 3: Interpreting Expressive Engagement {#sec:expressive-engagement}


```{r correlation-tables-prob, results = 'asis', cache = FALSE}

# vars <- c("nchar_problems_ihs",   "civic", "pol_int", "pk_scale", "engage_scale", "deliberation_scale", "discuss_any", "discuss_days_scaled", "prefers_compromise", "likely_vote16")

vars_prob <- c("nchar_problems_ihs", "civic", "pol_int", "pk_scale", "engage_scale", "discuss_pol_days", "likely_vote16")

corr_prob <- a16 %>%
    dplyr::select(dplyr::all_of(vars_prob)) %>%  # all_of() avoids the deprecated bare-vector selection
    correlate(use = "pairwise.complete.obs") %>%
    focus(nchar_problems_ihs) %>%
    rename(Term = term,
           Correlation = nchar_problems_ihs) %>% 
    mutate(Correlation = round(Correlation, 3)) %>% 
    arrange(desc(abs(Correlation)))

corr_prob$Term <- convert_labels(corr_prob$Term, extracted = TRUE)

corr_prob %>% 
    kbl(
        align     = c("r", "r"),     
        format    = kable_format,
        booktabs  = TRUE,
        caption   = "Select Correlates of Expressive Engagement",
        #col.names = c("", "Did Not Vote", "Voted")
        linesep = ""
      ) %>% 
      kable_styling()

# latex_options = "b" or "hold_position"

```


```{r figure-A1-8-correlation-pca-plot-prob, fig.height = 4.5, fig.width = 4.5, fig.align = 'center', fig.cap = 'Principal Component Analysis suggests that Expressive Engagement (nchar\\_problems\\_ihs) loads more on the second principal component, which appears to distinguish expressive articulation (e.g., verbosity) from behavioral political action (e.g., volunteering or donating). The first component, capturing a general political engagement axis---from low (left) to high (right)---is defined primarily by political interest and knowledge. These results suggest that Expressive Engagement reflects both conventional forms of participation and a distinct, more symbolic mode of political involvement.', cache = FALSE}
# Standardize and run PCA
pca_vars_prob   <- a16[, c(vars_prob)]
pca_vars_prob   <- scale(pca_vars_prob)
pca_result_prob <- PCA(pca_vars_prob, graph = FALSE)

# Plot variables and loadings
pca_prob <- fviz_pca_var(pca_result_prob,
                           col.var = "contrib", # color by contribution to PCs
                           gradient.cols = c("blue", "red"),
                           repel = TRUE) + 
    labs(title = "Expressive Engagement PCA", 
         color = "Contribution")


pca_prob

# pca_affect + pca_prob +
        #plot_layout(guides = "collect") + 
        # plot_annotation(tag_levels = "A")

```

\FloatBarrier

#### Study 3: Validated Turnout in 2016 and 2020 vs Expressive Engagement in 2016 {#sec:turnout-2016-2020-vs-engagement}

```{r turnout-models-table, results = 'asis'}
models <- list(mod_valid_ihs[[3]], tout_all)
cov_labels <- star_var(models, omit = "pid.*Oth")
star_ft(models,
    covariate.labels = cov_labels,
    no.space         = TRUE,
    model.names      = TRUE,
    omit             = "pid.*Oth",
    dep.var.labels = c("Turnout 2016", "Turnout 2020"),
    title = "Validated Turnout incorporating match probability in 2016 and 2020 vs Expressive Engagement in 2016",
    label = "tab:turnout-models-table")
```

\FloatBarrier



```{r turnout-vs-nonresponse-models, include = FALSE}

## original in anes_analysis.R
## ---- turnout-vs-nonresponse-models ----

mod_turn16_vs_non <- list()

mod_turn16_vs_non[[1]] <- a16 %>% 
    filter(reg16_bin == 1) %>% 
    glm(vote2016_prob ~ nonresp_lddr16_fct * pid4_16 + ft_trump_clinton + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, # + reg_intent, 
        family = quasibinomial, data = .) 
    
mod_turn16_vs_non[[2]] <- 
    a16 %>% filter(reg16_bin == 1) %>%
    glm(vote2016_prob ~ nonresp_lrdd16_fct * pid4_16 + ft_trump_clinton + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, # + reg_intent, 
        family = quasibinomial, data = .) 


mod_turn20_vs_non <- list()

mod_turn20_vs_non[[1]] <- a20 %>% 
    filter(reg16_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nonresp_lddr16_fct * pid4_16 + ft_trump_clinton + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, # + reg_intent, 
        family = quasibinomial, data = .) 
    
mod_turn20_vs_non[[2]] <- a20 %>% 
    filter(reg16_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nonresp_lrdd16_fct * pid4_16 + ft_trump_clinton + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, # + reg_intent, 
        family = quasibinomial, data = .) 


mod_turn16_vs_nonall <- a16 %>% 
    filter(reg16_bin == 1) %>%
    glm(vote2016_prob ~ nonresp_all16 * pid4_16 + ft_trump_clinton + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, # + reg_intent, 
        family = quasibinomial, data = .) 


mod_turn20_vs_nonall <- a20 %>% 
    filter(reg16_bin == 1) %>%
    glm(vote_valid20_weighted ~ nonresp_all16 * pid4_16 + ft_trump_clinton + educ16 + age16 + female16 + race16 + income16 + pol_attn16 + mode16, # + reg_intent, 
        family = quasibinomial, data = .) 

```
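
The turnout models use `family = quasibinomial`, which suits an outcome that is a proportion rather than a strict 0/1 indicator. The sketch below illustrates the point on simulated data. The construction of the fractional outcome (a vote indicator multiplied by a match probability) is an assumption made for the example and may differ from how the validated-turnout variables are built.

```r
# Simulated fractional outcome in [0, 1]; the construction is hypothetical.
set.seed(5)
n <- 1000
x <- rnorm(n)
voted  <- rbinom(n, 1, plogis(0.5 + 0.8 * x))
match  <- runif(n, 0.6, 1)               # stand-in for a record-match probability
y_frac <- voted * match

# Quasibinomial accepts non-integer outcomes and estimates a dispersion parameter
fit_q <- glm(y_frac ~ x, family = quasibinomial)

# Binomial gives the same point estimates but warns about non-integer successes
fit_b <- suppressWarnings(glm(y_frac ~ x, family = binomial))

print(coef(fit_q))
print(summary(fit_q)$dispersion)
```

The two families share point estimates. They differ in the standard errors, which the quasibinomial fit scales by the estimated dispersion.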

\FloatBarrier


```{r, express-engage-2020-models, include = FALSE}

# Continuous
mod_eng_cont_20_reg <- list()

mod_eng_cont_20_reg[[1]] <- a20 %>% 
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_ihs + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20, family = quasibinomial, data = .)

mod_eng_cont_20_reg[[2]] <- a20 %>% 
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_ihs + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20 + pol_attn20, family = quasibinomial, data = .)

mod_eng_cont_20_reg[[3]] <- a20 %>% 
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_ihs + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20 + pol_attn20 + likely_vote20, family = quasibinomial, data = .)


# Categorical
mod_eng_cat_20_reg <- list()

mod_eng_cat_20_reg[[1]] <- a20 %>% 
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_fct + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20, family = quasibinomial, data = .)

mod_eng_cat_20_reg[[2]] <- a20 %>% 
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_fct + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20 + pol_attn20, family = quasibinomial, data = .)

mod_eng_cat_20_reg[[3]] <- a20 %>% 
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_fct + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20 + pol_attn20 + likely_vote20, family = quasibinomial, data = .)


# Binary
mod_eng_bin_20_reg <- list()

mod_eng_bin_20_reg[[1]] <- a20 %>% 
    #filter(match_ok >= .9) %>%
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_bin + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20, #weights = match_ok,
        family = quasibinomial, data = .)


mod_eng_bin_20_reg[[2]] <- a20 %>% 
    #filter(match_ok >= .75) %>%
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_bin + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20 + pol_attn20, #weights = match_ok, 
        family = quasibinomial, data = .)

mod_eng_bin_20_reg[[3]] <- a20 %>% 
    #filter(match_ok >= .9) %>%
    filter(reg20_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_problems20_bin + pid4_20_fct + female20 + educ20 + age20 + race20_fct + income20 + mode_pre20 + pol_attn20 + likely_vote20, #weights = match_ok, 
        family = quasibinomial, data = .)

ggeffects::ggpredict(mod_eng_cat_20_reg[[3]], terms = "nchar_problems20_fct" )

ggeffects::ggpredict(mod_eng_bin_20_reg[[3]], terms = "nchar_problems20_bin" )

```


```{r figure-A1-9-vote20-eng20-plot-run, fig.height = 3.5, fig.width = 8, fig.cap = "Marginal effects of Expressive Engagement in 2020 on predicted probability of validated turnout in 2020. Panel A uses Model 3 from Table\\ \\ref{tab:turnout20-vs-engage20-cont-table}, Panel B uses Model 3 from Table\\ \\ref{tab:turnout20-vs-engage20-cat-table} and Panel C uses Model 3 from Table\\ \\ref{tab:turnout20-vs-engage20-bin-table}.", include = FALSE}

# plot_model(mod_eng_bin_20_reg[[3]], type = "eff", terms = "nchar_problems20_bin")

vote20_eng20_cont_plot <-
        plot_model(mod_eng_cont_20_reg[[3]], type = "eff", terms = "nchar_problems20_ihs") +
    scale_y_continuous(limits = c(.64, .8),
                       breaks = seq(.65, .8, .05),
                       labels = scales::percent_format(accuracy = 1)) +
    labs(title = element_blank(),
         x = "Continuous", 
         y = "Pr(Validated Turnout 2020)"
    ) + 
    theme_book() #+

vote20_eng20_cat_plot <-
        plot_model(mod_eng_cat_20_reg[[3]], type = "eff", terms = "nchar_problems20_fct") +
    scale_y_continuous(limits = c(.64, .8),
                       breaks = seq(.65, .8, .05),
                       labels = scales::percent_format(accuracy = 1)) +
    scale_x_continuous(
        #limits = c(-1, 1),
        breaks = seq(1, 5, 1),
        labels = c("0", "(0,1.2]", "(1.2,1.9]", "(1.9,2.2]", "(2.2,3.9]")
    ) +
    labs(title = element_blank(),
         x = "Categorical", 
         y = element_blank() # "Pr(Validated Turnout 2020)"
    ) + 
    theme_book() +
    theme(axis.text.x = element_text(angle = 30, hjust = 1))
#, 
#                                     vjust = 0.5, hjust=1))
 

vote20_eng20_bin_plot <-
        plot_model(mod_eng_bin_20_reg[[3]], type = "eff", terms = "nchar_problems20_bin") +
    scale_y_continuous(limits = c(.64, .8),
                       breaks = seq(.65, .8, .05),
                       labels = scales::percent_format(accuracy = 1)) +
    # scale_x_continuous(
    #     limits = c(-1, 1),
    #     breaks = seq(-1, 1, 1),
    #     labels = c("-1\nClinton+", "0", "1\nTrump+")
    # ) + 
    labs(title = element_blank(),
         x = "Binary", 
         y = element_blank() #"Pr(Validated Turnout 2020)"
    ) + 
    theme_book() #+

patch <- vote20_eng20_cont_plot + vote20_eng20_cat_plot + vote20_eng20_bin_plot + 
    plot_layout(guides = "collect") &
    plot_annotation(tag_levels = "A") &
    #ylab(NULL)  & 
    theme(plot.margin = margin(5.5, 5.5, 5.5, 0))

# Use the tag label as an x-axis label
wrap_elements(panel = patch) +
  labs(tag = "Expressive Engagement 2020") +
  theme(
    #plot.tag = element_text(size = rel(1)),
    plot.tag.position = "bottom"
  )


```

#### Study 3: Validated Turnout in 2020 vs Expressive Engagement in 2020  {#sec:express-engagement-2020}


```{r figure-A1-9-vote20-eng20-plot, fig.height = 3.5, fig.width = 8, fig.cap = "Marginal effects of Expressive Engagement in 2020 on predicted probability of validated turnout in 2020. Panel A uses Model 3 from Table\\ \\ref{tab:turnout20-vs-engage20-cont-table}, Panel B uses Model 3 from Table\\ \\ref{tab:turnout20-vs-engage20-cat-table} and Panel C uses Model 3 from Table\\ \\ref{tab:turnout20-vs-engage20-bin-table}.", include = TRUE}
wrap_elements(panel = patch) +
  labs(tag = "Expressive Engagement 2020") +
  theme(plot.tag.position = "bottom")
```

\begin{footnotesize}
\begin{eqnarray}
\begin{cases}
\end{cases}
\label{eq:expressive-engagement-bin}
\end{eqnarray}
\end{footnotesize}


\begin{footnotesize}
\begin{eqnarray}
\begin{cases}
\end{cases}
\label{eq:expressive-engagement-cat}
\end{eqnarray}
\end{footnotesize}
    

```{r, figure-A1-10-express-engage-hist-1620, fig.cap = "Distribution of Expressive Engagement across modes in 2016 and 2020. Each histogram shows the IHS-transformed total character count across four open-ended `most important problem' responses. The spikes at zero reflect nonresponse to all items; in 2020, a secondary peak just below 1 captures minimal engagement (typically one short answer); and the broader distribution represents fuller engagement. Note: Because the focus is on comparing distributions, the $y$-axes vary across panels. In 2016, web $n$ = 3,090 and face-to-face $n$ = 1,180. In 2020, web $n$ = 7,782, video $n$ = 359, and telephone $n$ = 139."}

# --- 2016: Web and FTF ---
p_a16_web <- a16 %>% 
  filter(mode16 == "web") %>%
  ggplot(aes(x = nchar_problems_ihs)) +
  geom_histogram(bins = 30, fill = "gray70", color = "black") +
  labs(y = "web", x = NULL, title = "2016") +
  theme_minimal() +
  theme(#axis.text.y = element_blank(),
        #axis.ticks.y = element_blank(),
        axis.title.y = element_text(angle = 0, vjust = 0.5)) +
  scale_x_continuous(breaks = seq(0, 4, by = 1), limits = c(-0.1, 4))


p_a16_ftf <- a16 %>% 
  filter(mode16 == "ftf") %>%
  ggplot(aes(x = nchar_problems_ihs)) +
  geom_histogram(bins = 30, fill = "gray70", color = "black") +
  labs(y = "ftf", x = NULL) +
  theme_minimal() +
  theme(#axis.text.y = element_blank(),
        #axis.ticks.y = element_blank(),
        axis.title.y = element_text(angle = 0, vjust = 0.5)) +
  scale_x_continuous(breaks = seq(0, 4, by = 1), limits = c(-0.1, 4))


# --- 2020: Web, Video, Tele ---
p_a20_web <- a20 %>%
  filter(mode_pre20 == "web") %>%
  ggplot(aes(x = nchar_problems20_ihs)) +
  geom_histogram(bins = 30, fill = "gray70", color = "black") +
  labs(y = "web", x = NULL, title = "2020") +
  theme_minimal() +
  theme(#axis.text.y = element_blank(),
        #axis.ticks.y = element_blank(),
        axis.title.y = element_text(angle = 0, vjust = 0.5)) +
  scale_x_continuous(breaks = seq(0, 4, by = 1), limits = c(-0.1, 4))


p_a20_video <- a20 %>%
  filter(mode_pre20 == "video") %>%
  ggplot(aes(x = nchar_problems20_ihs)) +
  geom_histogram(bins = 30, fill = "gray70", color = "black") +
  labs(y = "video", x = NULL) +
  theme_minimal() +
  theme(#axis.text.y = element_blank(),
        #axis.ticks.y = element_blank(),
        axis.title.y = element_text(angle = 0, vjust = 0.5)) +
  scale_x_continuous(breaks = seq(0, 4, by = 1), limits = c(-0.1, 4))


p_a20_tele <- a20 %>%
  filter(mode_pre20 == "tele") %>%
  ggplot(aes(x = nchar_problems20_ihs)) +
  geom_histogram(bins = 30, fill = "gray70", color = "black") +
  labs(y = "tele", x = NULL ) + #"nchar (IHS transformed) 2020") +
  theme_minimal() +
  theme(#axis.text.y = element_blank(),
        #axis.ticks.y = element_blank(),
        axis.title.y = element_text(angle = 0, vjust = 0.5)) +
  scale_x_continuous(breaks = seq(0, 4, by = 1), limits = c(-0.1, 4))


# Stack each column
r2016_stack <- (p_a16_web / p_a16_ftf / plot_spacer()) + 
  plot_layout(heights = c(1, 1, 1))

r2020_stack <- (p_a20_web / p_a20_video / p_a20_tele) + 
  plot_layout(heights = c(1, 1, 1))

# Combine columns with explicit layout
# (r2016_stack | r2020_stack) + 
#   plot_layout(ncol = 2) #+ 
#     plot_layout(guides = "collect") #+
#     #plot_annotation(tag_levels = "A")


# Combine with shared x-axis label
final_plot <- (r2016_stack | r2020_stack) + 
  plot_layout(ncol = 2) & 
  theme(plot.margin = margin(5, 5, 5, 5))  # ensure spacing is OK


final_plot + 
    plot_annotation(
        theme = theme(
            plot.caption = element_text(hjust = 0.5, size = 12), #face = "bold"),
            plot.margin = margin(5, 5, 20, 5)
        ),
        caption = "Expressive Engagement"
    )
    
    
```
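
The transformation behind the histograms above can be sketched as follows. This is a minimal illustration, assuming the measure is the standard inverse hyperbolic sine, $\sinh^{-1}(x) = \ln(x + \sqrt{x^2 + 1})$, applied to the summed character count. The `ihs()` helper and the toy answers are hypothetical, and any rescaling applied before the transformation in the actual data preparation is not shown here.

```r
# Inverse hyperbolic sine (IHS); equivalent to log(x + sqrt(x^2 + 1)).
# `ihs()` is a hypothetical helper name used only for this sketch.
ihs <- function(x) asinh(x)

# Toy respondents (hypothetical text): four open-ended answers each.
answers <- list(
  c("", "", "", ""),                                   # nonresponse to all items
  c("jobs", "", "", ""),                               # one short answer
  c("the economy", "health care", "immigration", "climate change")
)

# Total characters written across the four items, per respondent.
total_chars <- vapply(answers, function(a) sum(nchar(a)), numeric(1))

# Transformed engagement score (assumption: no rescaling before the transform).
engagement <- ihs(total_chars)

data.frame(total_chars, engagement)
```

Unlike a log transformation, the IHS is defined at zero, so respondents who skip every item remain in the data with a score of 0 rather than being dropped.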


```{r anes2016-mip-answered, results = 'asis'}
tab_a16n <- a16 %>% 
    tabyl(mode16, nchar_prob_bin) %>% 
    t()

tab_a16perc <- a16 %>% 
    tabyl(mode16, nchar_prob_bin) %>% 
    adorn_percentages() %>% 
    adorn_pct_formatting(0) %>% 
    t()

df <- data.frame(ftfn = tab_a16n[-1,1], 
                 ftfp = tab_a16perc[-1,1], 
                 webn = tab_a16n[-1,2], 
                 webp = tab_a16perc[-1,2])

df %>% 
    as.data.frame() %>% 
    #slice(-1) %>% 
    # rename(`ANES 2020 MIP Answered` = text_prob20_bin,
    #        `N` = n,
    #        `Percent` = percent) %>% 
    kbl(
    align = "r",
    format = kable_format,
    booktabs = TRUE,
    caption = "ANES 2016 `Most Important Problems' Answered \\label{tab:a16-mip-answered}",
    col.names = c("# MIP Answered", "n", "Percent", "n", "Percent")
    ) %>% 
  add_header_above(header = c(" " = 1, "Face-to-Face" = 2, "Web" = 2), align = "c") %>% 
  kable_styling(latex_options = "hold_position")
```


```{r anes2020-mip-answered, results = 'asis'}
# a20 %>% 
#     tabyl(mode_pre20, text_prob20_bin) %>%
#     adorn_percentages() %>% 
#     adorn_pct_formatting(0) %>% 
#     t() %>% 
#     as.data.frame() %>% 
#     slice(-1) %>% 
#     # rename(`ANES 2020 MIP Answered` = text_prob20_bin,
#     #        `N` = n,
#     #        `Percent` = percent) %>% 
#     kbl(
#     format = kable_format,
#     booktabs = TRUE,
#     caption = "ANES 2020 'Most Important Problems' Answered \\label{tab:a20-mip-answered}",
#     col.names = c("# MIP Answered", "Tele", "Video", "Web")
#     ) %>% 
#   kable_styling(latex_options = "hold_position")


tab_a20n <- a20 %>% 
    tabyl(mode_pre20, text_prob20_bin) %>% 
    t()

tab_a20perc <- a20 %>% 
    tabyl(mode_pre20, text_prob20_bin) %>% 
    adorn_percentages() %>% 
    adorn_pct_formatting(0) %>% 
    t()

df <- data.frame(telen  = tab_a20n[-1,1], 
                 telep  = tab_a20perc[-1,1], 
                 videon = tab_a20n[-1,2], 
                 videop = tab_a20perc[-1,2],
                 webn   = tab_a20n[-1,3], 
                 webp   = tab_a20perc[-1,3])

df %>% 
    as.data.frame() %>% 
    #slice(-1) %>% 
    # rename(`ANES 2020 MIP Answered` = text_prob20_bin,
    #        `N` = n,
    #        `Percent` = percent) %>% 
    kbl(
    align = "r",
    format = kable_format,
    booktabs = TRUE,
    caption = "ANES 2020 `Most Important Problems' Answered \\label{tab:a20-mip-answered}",
    col.names = c("# MIP Answered", "n", "Percent", "n", "Percent", "n", "Percent")
    ) %>% 
  add_header_above(header = c(" " = 1, "Tele" = 2, "Video" = 2, "Web" = 2), align = "c") %>% 
  kable_styling(latex_options = "hold_position")
```


```{r turnout20-vs-engage20-cont-table, results = 'asis'}

cov_labels <- star_var(mod_eng_cont_20_reg, omit = c("pid.*Oth"))

star_ft(mod_eng_cont_20_reg,
        covariate.labels = cov_labels,
        omit             = c("pid.*Oth"),
        no.space         = TRUE,
        model.names      = TRUE,
        dep.var.labels = c("Validated Turnout 2020"),
        title = "Validated turnout incorporating match probability in 2020 vs Expressive Engagement 2020 (Continuous) using quasibinomial models.",
        label = "tab:turnout20-vs-engage20-cont-table"#,
        # add.lines = list(
        #     c("Subset for Registered Voters?", rep("\\text{No}", 4) ) #,
        #     # c("Weighted by ", rep("\\text{No}", 4) )
        #    ),
        # notes = c("*$p<0.05$")
    )

# , rep("\\text{Yes}", 3)

```


```{r turnout20-vs-engage20-cat-table, results = 'asis'}

cov_labels <- star_var(mod_eng_cat_20_reg, omit = c("pid.*Oth"))

star_ft(mod_eng_cat_20_reg,
        covariate.labels = cov_labels,
        omit             = c("pid.*Oth"),
        no.space         = TRUE,
        model.names      = TRUE,
        dep.var.labels = c("Validated Turnout 2020"),
        title = "Validated turnout incorporating match probability in 2020 vs Expressive Engagement 2020 (Categorical) using quasibinomial models.",
        label = "tab:turnout20-vs-engage20-cat-table"#,
        # add.lines = list(
        #     c("Subset for Registered Voters?", rep("\\text{No}", 4) ) #,
        #     # c("Weighted by ", rep("\\text{No}", 4) )
        #    ),
        # notes = c("*$p<0.05$")
    )

# , rep("\\text{Yes}", 3)

```


```{r turnout20-vs-engage20-bin-table, results = 'asis'}

cov_labels <- star_var(mod_eng_bin_20_reg, omit = c("pid.*Oth"))

star_ft(mod_eng_bin_20_reg,
        covariate.labels = cov_labels,
        omit             = c("pid.*Oth"),
        no.space         = TRUE,
        model.names      = TRUE,
        dep.var.labels = c("Validated Turnout 2020"),
        title = "Validated turnout incorporating match probability in 2020 vs Expressive Engagement 2020 (Binary) using quasibinomial models.",
        label = "tab:turnout20-vs-engage20-bin-table"#,
        # add.lines = list(
        #     c("Subset for Registered Voters?", rep("\\text{No}", 1), rep("\\text{Yes}", 3))
        #    ),
        # notes = c("*$p<0.05$")
    )


```

\FloatBarrier


```{r turnout-partisan-models, include = FALSE, eval = TRUE}
mod_valid_express_partisan <- list()


mod_valid_express_partisan[[1]] <-
    a16 %>% filter(reg16_bin == 1) %>% 
    glm(vote2016_prob ~ nchar_align_ihs * pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16, 
                      family = quasibinomial, data = . )


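# Model 2: validated turnout in 2020, no feeling thermometer
# (matches the "Turnout 2020" column label and the Panel B y-axis)
mod_valid_express_partisan[[2]] <- 
    a20 %>% filter(reg16_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_align_ihs * pid4_16 + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16,
        family = quasibinomial,
        data = .) 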


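# Model 3: validated turnout in 2016, with feeling thermometer
# (matches the third "Turnout 2016" column label; previously identical to Model 4)
mod_valid_express_partisan[[3]] <- a16 %>% 
    filter(reg16_bin == 1) %>% 
    glm(vote2016_prob ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16, 
                      family = quasibinomial, data = . )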


mod_valid_express_partisan[[4]] <- a20 %>% 
    filter(reg16_bin == 1) %>% 
    glm(vote_valid20_weighted ~ nchar_align_ihs * pid4_16 + ft_trump_clinton + female16 + educ16 + age16 + race4_16 + income16 + pol_attn16 + mode16,
        family = quasibinomial,
        data = .) 


```


```{r include = FALSE}

ggpredict(mod_valid_express_partisan[[1]], terms = c("nchar_align_ihs", "pid4_16[Dem, Ind, Rep]"))

ggpredict(mod_valid_express_partisan[[2]], terms = c("nchar_align_ihs", "pid4_16[Dem, Ind, Rep]"))

```

#### Study 3: Validated Turnout vs Expressive Alignment {#sec:turnout-vs-express-partisan}


```{r figure-A1-11-turnout-partisan-plot, fig.height = 3.75, fig.width = 8, fig.cap = "Marginal effects of Expressive Alignment in 2016 on validated turnout in 2016 and 2020, estimated with quasibinomial logit models (see Table\\ \\ref{tab:turnout-alignment}).", include = TRUE, cache = cache_lgl}


# , colors = "gs"
#pp[[1]] <- 
    
turnout_partisan_plot16 <-
    plot_model(mod_valid_express_partisan[[1]], type = "pred", terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Unreg = "#cbcaca")) +
    aes(linetype = group, color = group) +
    scale_y_continuous(#limits = c(.18, 1),
                       labels = scales::percent_format(accuracy = 1)) +
    scale_x_continuous(
        limits = c(-1, 1),
        breaks = seq(-1, 1, 1),
        labels = c("-1\nClinton+", "0", "1\nTrump+")
    ) + 
    labs(title = element_blank(),
         x = "Expressive Alignment 2016", 
         y = "Pr(Validated Turnout 2016)",
         color = "Party ID",
         linetype = "Party ID") +
    theme_book() #+

turnout_partisan_plot20 <-
    plot_model(mod_valid_express_partisan[[2]], type = "pred", terms = c("nchar_align_ihs", "pid4_16 [Dem, Ind, Rep]"), colors = c(Dem = "#013388", Rep = "#cc0000", Ind = "mediumpurple", Unreg = "#cbcaca")) +
    aes(linetype = group, color = group) +
    scale_y_continuous(#limits = c(.18, 1),
                       labels = scales::percent_format(accuracy = 1)) +
    scale_x_continuous(
        limits = c(-1, 1),
        breaks = seq(-1, 1, 1),
        labels = c("-1\nClinton+", "0", "1\nTrump+")
    ) + 
    labs(title = element_blank(),
         x = "Expressive Alignment 2016", 
         y = "Pr(Validated Turnout 2020)",
         color = "Party ID",
         linetype = "Party ID") +
    theme_book() #+

    
turnout_partisan_plot16 + turnout_partisan_plot20 +
    plot_layout(guides = "collect") + 
    plot_annotation(tag_levels = "A")

```


```{r turnout-partisan-table, results = 'asis'}


# models <- list(aff_vs_rr16, aff_vs_rr20)

cov_labels <- star_var(mod_valid_express_partisan, omit = c("pid.*Oth"))

star_ft(mod_valid_express_partisan,
        covariate.labels = cov_labels,
        omit             = c("pid.*Oth"),
        no.space         = TRUE,
        model.names      = TRUE,
        dep.var.labels   = c("Turnout 2016", "Turnout 2020", "Turnout 2016", "Turnout 2020"),
        title = "Predicted probability of validated turnout incorporating match probability in 2016 and 2020 vs Expressive Alignment in 2016.",
        label = "tab:turnout-alignment",
        #notes = c("Race variable included in models but omitted above for space.", "*$p<0.05$"),
        add.lines = list(
            c("Feeling Thermometer?", rep("\\text{No}", 2), rep("\\text{Yes}", 2))
        )
    )

```

\FloatBarrier

## Study 4: Text as Multilingual Instrument Validation {#sec:appendix-study4}

\FloatBarrier

#### Study 4: Table for Afrobarometer Summary Statistics {#sec:afro-summary-stats}


```{r lang-summary-stats-table-afro, results = 'asis', cache = FALSE, include = TRUE}

afro %>% 
    mutate(lang = str_to_title(lang)) %>% 
    tabyl(lang) %>% 
    adorn_pct_formatting(digits = 0) %>% 
    adorn_totals(where = "row") %>% 
    mutate(n = add_comma(n)) %>% 
    select(-valid_percent) %>% 
    kable0(
      col.names = c("Language", "n", "Percent"), #, "Valid Percent"),
      escape    = TRUE, 
      caption   = "Frequency table of languages within Afrobarometer",
      align = c("l", "r", "r"),
      linesep = c("", "", "", "\\addlinespace")
    ) %>% 
    sub("\\\\toprule",    "", .) %>%
    #sub("\\\\midrule",    "", .) %>% 
    sub("\\\\bottomrule", "", .) 


```

\FloatBarrier

#### Study 4: Table for Afrobarometer Regression Results {#sec:afro-regression-table}

```{r regression2-afro-models, include = FALSE}


## ----  regression2-afro-models ----
# regressing by language (separate models)
mod_eng <- afro %>%
  filter(lang == "English") %>%
  filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>%
  glm.nb(nchar_dem_total ~ dem_importance + understood_dem + gender + educ + age + income_proxy + race5, data = .)

mod_french <- afro %>%
  filter(lang == "French") %>%
  filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>%
  glm.nb(nchar_dem_total ~ dem_importance + understood_dem + gender + educ + age + income_proxy + race5, data = .)

mod_port <- afro %>%
  filter(lang == "Portuguese") %>%
  filter(understood_dem != "Did not understand" & understood_dem != "Missing") %>%
  glm.nb(nchar_dem_total ~ dem_importance + understood_dem + gender + educ + age + income_proxy + race5, data = .)


```


```{r sum2-afro-table, results = "asis", cache = cache_lgl}


## ----  sum2-afro-table, results = 'asis' ----

models <- list(mod_eng, mod_french, mod_port, mod_dem_int)

cov_labels <- star_var(models)


star0(models,
      covariate.labels = cov_labels,
      omit.stat        = "theta",
      dep.var.labels   = c("Number of Characters: `What Democracy Means to You?'"), 
      column.labels    = c("English", "French", "Portuguese", "Interaction"),
      title            = "Number of characters about what democracy means vs democracy battery, by language and interacted with language",
      label            = "tab:sum2-afro-table"
      )

```

\FloatBarrier


#### Study 4: Afrobarometer Questions {#sec:afro-questions}

\FloatBarrier

#### Study 4: Importance of Democracy Scale: Sum vs. Mean {#sec:afro-democracy-scale}


```{r figure-A1-12-afro-raw-data, fig.height=4, dev='png', fig.cap = "Scatter plot of raw data showing number of characters about `What democracy means to you?' versus Importance of Democracy Scale with smoothed loess curves by language.", dpi = 150, eval = TRUE, cache = TRUE}

# afro_subsampled <- bind_rows(
#   afro %>% filter(lang != "English"),
#   afro %>% filter(lang == "English") %>% sample_frac(0.5)
# )

afro %>% 
  #sample_frac(.25) %>% 
  filter(!is.na(lang)) %>% 
  ggplot() + 
    aes(x = dem_importance, y = nchar_dem_total, color = lang) +    
    geom_jitter(width = .075, height = .05, alpha = 0.05) + 
    geom_smooth(method = "loess") +
    labs(y = "# Characters about\n'What democracy means to you?'",
         x = "Importance of Democracy Scale",
         color = "Language") +
    scale_x_continuous(#limits = c(0, 50), #breaks = seq( 10, 20, 30, 40),
                       labels = c("   Low", "", "", "", "", "High   ")) + 
#     scale_color_manual(values = c(
#       "English"    = "#D95F02",     # softer orange
#       "French"     = "#1B9E77",
#       "Portuguese" = "#7570B3"
# )) +
                       theme_book()


# afro_subsampled %>% 
#   #sample_frac(.25) %>% 
#   filter(!is.na(lang)) %>% 
#   ggplot() + 
#     aes(x = dem_importance, y = nchar_dem_total, color = lang) +    
#     geom_jitter(data = afro,
#               aes(x = dem_importance, y = nchar_dem_total),
#               color = ifelse(afro$lang == "English", "#A95F02", NA), # manually set English
#               alpha = 0.05, width = 0.075, height = 0.075) +
#     geom_jitter(data = afro %>% filter(lang != "English"),
#               aes(x = dem_importance, y = nchar_dem_total, color = lang),
#               alpha = 0.05, width = 0.075, height = 0.075) +
#     geom_smooth(method = "loess") +
#     labs(y = "# Characters about\n'What democracy means to you?'",
#          x = "Importance of Democracy Scale",
#          color = "Language") +
#     scale_x_continuous(#limits = c(0, 50), #breaks = seq( 10, 20, 30, 40),
#                        labels = c("   Low", "", "", "", "", "High   ")) + 
#     scale_color_manual(values = c(
#       "English"    = "#D95F02",     # softer orange
#       "French"     = "#1B9E77",
#       "Portuguese" = "#7570B3"
# )) +
#                        theme_book()


```

\FloatBarrier

## Study 5: Text as Manipulation Check of Social Exclusion {#sec:appendix-study5} 


```{r figure-A1-14-asian-white-nchar-time-cropped, eval = TRUE, fig.cap = "Relationship between total characters written and total survey time (minutes) by race and treatment condition. Smoothed loess curves highlight that for Asian American respondents, writing output plateaus or declines at longer durations, consistent with a potential `writer's block' effect. Note: one white control subject who wrote more than 2,000 characters is cropped from view for better visualization (loess curves remain unchanged).", out.width = '80%', fig.align = 'center'}

## ---- figure-A1-14-asian-white-nchar-time-cropped ----
# 1500 cutoff on y-axis
aap %>% 
    filter(!is.na(asian_white_cond)) %>% 
  ggplot() + 
  aes(x = time_min, y = nchar_sum) + 
  geom_point() + 
  geom_smooth(color = "blue3", method = "loess", se = FALSE, span = 1) + 
  facet_wrap(~asian_white_cond, ncol = 3, dir = "v") +
  #theme(axis.text.x = element_text(angle = 30, hjust = 1))
  #scale_y_continuous(limits = c(0, 2500))
  labs(x = "Time (min)",
       y = "Total Number of Characters",
       #caption = "\nNote: one white control with >2000 cropped for visualization"
       ) +
    coord_cartesian(ylim=c(0, 1500)) +
    theme_minimal(base_size = 13)

  
```


```{r aap-wilcox-table, results = 'asis'}

## ---- aap-wilcox-table, results = 'asis' ----

# significant differences in overall time by race and condition
wilcox_white <- aap %>% 
    filter(white == 1) %>% 
    wilcox.test(data = ., time_min ~ treatment_cit) %>% 
    broom::tidy() 

wilcox_asian <- aap %>% 
    filter(asian == 1) %>%  
    wilcox.test(data = ., time_min ~ treatment_cit) %>% 
    broom::tidy() 

# wilcox_nonwhite <- aap %>% 
#     filter(white == 0) %>%  
#     wilcox.test(data = ., time_min ~ treatment_cit) %>% 
#     broom::tidy() 


# summary_table <- aap %>% 
#     filter(!is.na(asian_white_cond)) %>% 
#     group_by(asian_white_other_cond) %>% 
#     summarize(
#       n    = n(),
#       Min  = min(time_min),
#       Mean = mean(time_min) %>% round(2),
#       Max  = max(time_min),
#       sd   = sd(time_min) %>% round(2)
#     ) %>% 
#     kbl(
#         format   = 'latex', 
#         col.names = c("Race and Condition", "n", "Min", "Mean", "Max", "sd"),
#         caption  = "Summary statistics of total minutes in study, by race and condition",
#         booktabs = TRUE,
#         linesep  = "") %>% 
#     sub("\\\\toprule",    "", .) %>%
#     #sub("\\\\midrule",    "", .) %>% 
#     sub("\\\\bottomrule", "", .) 
# 
# 
# summary_table
# 
# save_kable(summary_table, file = here("text_code", "summary-table.tex"))

wilcox_all <- rbind(wilcox_white, wilcox_asian) #, wilcox_nonwhite)

wilcox_all$test <- c(
    "White Treated vs White Control",
    "Asian-Am Treated vs Asian-Am Control"#   ,
#    "Nonwhite Treated vs Nonwhite Control"
)

wilcox_table <- wilcox_all %>% 
    select(test, statistic, p.value) %>% 
    mutate(p.value = round(p.value, 4)) %>% 
    rename(Group = test,
           `Wilcoxon $W$ Statistic` = statistic,
           `$p$-value` = p.value) %>% 
    kbl(
        format = 'latex', 
        caption = "Wilcoxon Rank Sum Test of Total Time in Study, by Race and Condition",
    booktabs = TRUE,
    escape   = FALSE) %>% 
    kable_styling(latex_options = "hold_position") %>% 
    sub("\\\\toprule",    "", .) %>%
    #sub("\\\\midrule",    "", .) %>% 
    sub("\\\\bottomrule", "", .) 


wilcox_table

# save_kable(wilcox_table, file = here("text_code", "asian-wilcox-table.tex"))


```


```{r aap-gamma-race-treat, results = 'asis'}
mod_gamma <- aap %>%  
    glm(
        time_min ~ asian_white_fct * treatment_fct,
        family = Gamma(link = "log"),
        data = .) 

cov_labels <- star_var(mod_gamma)

star_ft(mod_gamma,
        covariate.labels = cov_labels,
        #no.space         = TRUE,
        #model.names      = TRUE,
        #omit             = "pid.*Oth",
        dep.var.labels = c("Time (min)"),
    title = "Log-linked Gamma model of time vs race and treatment condition.",
    label = "tab:aap-gamma-race-treat")

```


```{r aap-nonlinear-rate, results = 'asis'}

## ---- nonlinear-rate-analysis-models, results = 'asis' ----

nonlinear_mod_white_asian <- aap %>%  
    glm.nb(
        nchar_sum ~ asian_white_fct * treatment_fct +
            time_min + I(time_min^2) +
            asian_white_fct:I(time_min^2) + treatment_fct:I(time_min^2) +
            asian_white_fct:treatment_fct:I(time_min^2) +
            offset(log(time_min)),
        data = .)


## ---- nonlinear-rate-analysis-table, results = 'asis' ----

cov_labels <- star_var(nonlinear_mod_white_asian)

stargazer(nonlinear_mod_white_asian, 
          covariate.labels = cov_labels,
          dep.var.labels = "Total Characters",
          digits = 4,
          header = FALSE,
          model.names      = TRUE,
          type = star_format,
          align = TRUE,
          font.size = 'scriptsize',
          star.cutoffs = star_cut_vector,
          notes.append = FALSE,
          notes = "*$p<0.05$",
          omit.stat = c("theta"),
          title = "Negative binomial with log(time) offset. Coefficients represent effects on the expected rate of characters written per minute as a function of race, treatment, and quadratic time (race $\\times$ treatment $\\times$ time$^2$).",
          label = "tab:aap-nonlinear-rate"
)

```
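
The offset in the model above is what turns the count model into a rate model. As a sketch in generic notation (the symbols $Y_i$, $t_i$, $\mathbf{x}_i$, and $\beta$ are introduced here for illustration only), let $Y_i$ be total characters written and $t_i$ total minutes in the study, with $\mathbf{x}_i$ collecting the race, treatment, and time terms. Fixing the coefficient on $\log t_i$ at one gives:

\begin{eqnarray*}
\log \mathrm{E}[Y_i \mid \mathbf{x}_i, t_i] &=& \log t_i + \mathbf{x}_i^{\top} \beta \\
\log \left( \frac{\mathrm{E}[Y_i \mid \mathbf{x}_i, t_i]}{t_i} \right) &=& \mathbf{x}_i^{\top} \beta
\end{eqnarray*}

Under this reading, each exponentiated coefficient is a multiplicative change in the expected number of characters per minute rather than in the raw character count.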

<!-- \doublespacing -->

\FloatBarrier

## Methods: Additional Data Processing and Modeling Considerations  {#sec:appendix-methods}


\FloatBarrier

#### Methods: Tests for differences in Expressive Alignment and Expressive Engagement, by mode in 2016 {#sec:mode-tests} 


```{r mode-ttest-table-build, include = FALSE}
# Ensure sign convention: positive = FtF − Web
a16 <- a16 |> mutate(mode16 = fct_relevel(mode16, "ftf", "web"))

# ---- Run Welch t-tests ----
tt_align <- t.test(nchar_align_ihs ~ mode16, data = a16, var.equal = FALSE)
tt_align_tidy <- tidy(tt_align)

tt_mip <- t.test(nchar_problems_ihs ~ mode16, data = a16, var.equal = FALSE)
tt_mip_tidy <- tidy(tt_mip)

# ---- Collect t-test results into a tibble ----
tt_tbl <- bind_rows(
    tt_align_tidy |> mutate(Measure = "Expressive Alignment"),
    tt_mip_tidy   |> mutate(Measure = "Expressive Engagement")
) |>
    select(Measure,
           t = statistic,
           df = parameter,
           p = p.value,
           estimate, estimate1, estimate2)  
           # estimate = mean(ftf) - mean(web) given factor relevel

# ---- Compute raw mean deltas explicitly (FtF − Web) in IHS units ----
# 
means_tbl <- a16 |>
    select(mode16, nchar_align_ihs, nchar_problems_ihs) |>
    pivot_longer(-mode16, names_to = "var", values_to = "value") |>
    group_by(var, mode16) |>
    summarise(mean = mean(value, na.rm = TRUE),
              sd   = sd(value, na.rm = TRUE),
              n    = sum(is.finite(value)),
              .groups = "drop") |>
    pivot_wider(names_from = mode16, values_from = c(mean, sd, n)) |>
    mutate(
        Measure = dplyr::recode(var,
                                nchar_align_ihs   = "Expressive Alignment",
                                nchar_problems_ihs   = "Expressive Engagement"
        ),
        delta_raw = mean_ftf - mean_web  # FtF − Web
    ) |>
    select(Measure, delta_raw, mean_ftf, mean_web, sd_ftf, sd_web, n_ftf, n_web)

# ---- Cohen's d via effectsize (FtF − Web) ----
# 
d_align <- cohens_d(nchar_align_ihs ~ mode16, data = a16, pooled_sd = TRUE, ci = 0.95) |> as.data.frame()
d_mip   <- cohens_d(nchar_problems_ihs ~ mode16, data = a16, pooled_sd = TRUE, ci = 0.95) |> as.data.frame()

d_tbl <- bind_rows(
    d_align |> mutate(Measure = "Expressive Alignment"),
    d_mip   |> mutate(Measure = "Expressive Engagement")
) |>
    transmute(
        Measure,
        d = Cohens_d,
        d_low = CI_low,
        d_high = CI_high,
        magnitude = interpret_cohens_d(d, rules = "cohen1988")
    )

# ---- Assemble final appendix table ----

final_tbl <- tt_tbl |>
    left_join(means_tbl, by = "Measure") |>
    left_join(d_tbl,     by = "Measure") |>
    transmute(
        Measure,
        t  = round(t, 2),
        df = round(df, 0),
        p  = ifelse(p < .001, "<0.001", sprintf("%.3f", p)),
        `mean $\\Delta$ (FtF-Web)` = paste0("$\\approx$ ", sprintf("%.3f", delta_raw)),
        `d $\\approx$` = sprintf("%.2f", d),
        Interpretation = case_when(
            abs(d) < 0.10 ~ "no meaningful difference",
            abs(d) < 0.20 ~ "tiny, statistically detectable",
            abs(d) < 0.50 ~ "small",
            abs(d) < 0.80 ~ "medium",
            TRUE          ~ "large"
        )
    )
```


```{r mode-ttest-table-print, results = 'asis'}
final_tbl %>%
    kbl(align = "lrrrrrl", 
        booktabs = TRUE,
        escape   = FALSE,
        caption = "Mode differences (FtF - Web) for transformed metadata measures. Cohen's $d$ is calculated from the observed means and pooled $SD$.") %>%
    kable_styling(full_width = FALSE)

#final_tbl

```
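
The effect size reported in the table above can be checked by hand. The following is a minimal sketch, assuming the usual two-sample definition of Cohen's $d$ with a pooled standard deviation; the helper `cohens_d_pooled()` and the toy vectors are hypothetical and are not part of the analysis code.

```r
# Hand-rolled Cohen's d with pooled SD (hypothetical helper for illustration).
# d = (mean(x) - mean(y)) / sd_pooled, where
# sd_pooled = sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
cohens_d_pooled <- function(x, y) {
  x <- x[is.finite(x)]            # drop NA / NaN / Inf before computing
  y <- y[is.finite(y)]
  n1 <- length(x)
  n2 <- length(y)
  sd_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / sd_pooled
}

# Toy data (not from the ANES): group means 4 and 3, both variances 4.
ftf_toy <- c(2, 4, 6)
web_toy <- c(1, 3, 5)

d_toy <- cohens_d_pooled(ftf_toy, web_toy)  # (4 - 3) / 2 = 0.5
```

Passing the face-to-face values first keeps the sign convention used in the table (FtF minus Web), so a positive $d$ means higher values in the face-to-face mode.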


```{r figure-A1-15-mode-ggridges-plot-align, fig.height = 2.5, fig.width = 5, fig.cap = "Distribution of Expressive Alignment 2016 by mode."}
a16 %>% 
    mutate(Mode = case_when(
        mode16 == "ftf" ~ "FtF",
        mode16 == "web" ~ "Web")
    ) %>% 
    ggplot() +
    aes(x = nchar_align_ihs, y = Mode, fill = Mode) +
    geom_density_ridges(
        stat = "binline",      
        bins = 40,             
        scale = 1.15,
        alpha = 0.7,
        color = "white",
        size = 0.2
    ) +
    labs(
        x = "Expressive Alignment 2016",
        y = NULL,
        #title = "Ridge histograms of transformed character counts by mode16"
    ) +
    theme_book()
```


```{r figure-A1-16-mode-ggridges-plot-engage, fig.height = 2.5, fig.width = 5, fig.cap = "Distribution of Expressive Engagement 2016 by mode."}

a16 %>% 
    mutate(Mode = case_when(
        mode16 == "ftf" ~ "FtF",
        mode16 == "web" ~ "Web")
    ) %>% 
    ggplot() +
    aes(x = nchar_problems_ihs, y = Mode, fill = Mode) +
    geom_density_ridges(
        stat = "binline",      
        bins = 40,             
        scale = 1.15,
        alpha = 0.7,
        color = "white",
        size = 0.2
    ) +
    labs(
        x = "Expressive Engagement 2016",
        y = NULL,
        #title = "Ridge histograms of transformed character counts by mode16"
    ) +
    theme_book()
```

\clearpage


